Metric structure of random networks

We propose a consistent approach to the statistics of the shortest paths in random graphs with a given degree distribution. This approach goes further than a usual tree ansatz and rigorously accounts for loops in a network. We calculate the distribution of shortest-path lengths (intervertex distances) in these networks and a number of related characteristics for networks with various degree distributions. We show that in the large network limit this extremely narrow intervertex distance distribution has a finite width while the mean intervertex distance grows with the size of a network. The size dependence of the mean intervertex distance is discussed in various situations.


I. INTRODUCTION


The issue of discrete random geometries arises in numerous problems of quantum gravity, string theory, condensed matter physics (e.g., branched polymers), and classical statistical mechanics. In these problems, a fundamental question about the global structure of a random network (that is, a statistical ensemble of matrices) and its consequences naturally arises. This question exists even when the notion "network" is not used directly in the description of such a problem. As a simple example, we mention the backgammon ("balls-in-boxes") model, which was considered as a mean-field description of simplicial gravity. Formally speaking, the formulation of this simple model does not contain the notion "network". However, the statistical ensembles produced by the backgammon model can be easily related to random networks with a complex distribution of connections, which is another representation of the model.

In this paper, we study the global structural organization of a wide class of complex random networks, or, speaking more strictly, we study their metric structure. An intervertex distance in a network is naturally defined as the length of the shortest path between a pair of vertices. So, the statistics of intervertex distances, that is, the intervertex distance distribution, actually determines the metric structure of a random network. This distribution is the basic structural characteristic of random networks, which have been under extensive study by physicists in recent years. Networks with fat-tailed degree distributions show a number of exciting effects and are especially intriguing. (By definition, degree is the total number of connections of a vertex, which is sometimes called "the connectivity of a vertex"; a degree distribution is the distribution of degrees of vertices.)

The intervertex distance distribution has been obtained only for several very specific graphs. Even for basic uncorrelated random networks with a given degree distribution, the first moment of the intervertex distance distribution, that is, the mean intervertex distance or the mean shortest-path length of a network, was only estimated [15], [16]. This estimation used the important fact that these networks are locally tree-like. This is not true at a large scale. In the recent paper [13], the presence of loops was taken into account for estimating the mean intervertex distance of networks with a fat-tailed degree distribution.

In this paper we propose a rigorous approach which takes into account both the locally tree-like structure of uncorrelated networks and the presence of loops on a large scale. This approach allows us to explicitly calculate the intervertex distance distribution and its moments, and to describe their dependence on the size of a network.

Our approach is valid for uncorrelated random networks with a given degree distribution. These basic "equilibrium" networks are graphs which are maximally random under the constraint that their degree distribution is given. In graph theory these networks (loosely speaking, one of their versions) are called labelled random graphs with a given degree sequence, or the configuration model [17], [18], [19], [20]. These networks are a starting point for the study of the effects of complex degree distributions, and so are of fundamental importance.

The (uncorrelated) random graph with a given degree distribution can be constructed in the following way. Take $N$ vertices. Attach to the vertices "spines", $\{q_i\}$, $i = 1, \ldots, N$, according to a given sequence $\{N(q)\}$, $q = 0, 1, 2, \ldots$, where $N = \sum_q N(q)$, so that the vertices look like a family of "hedgehogs". Connect various spines at random. This procedure provides the maximally random graph with a given degree distribution $\Pi(q) = N(q)/N$. Without loss of generality we can set the number of zero-degree (i.e. isolated) vertices to be zero, $\Pi(0) = 0$. The main global topological properties of such networks are governed by the parameter [21], [22]:

$$ z_1 = \frac{\overline{q^2} - \bar q}{\bar q} , \qquad (1) $$



which is the ratio of the mean numbers of the second- and first-nearest neighbours of a vertex in the network; $\bar q$ and $\overline{q^2}$ are the first and the second moments of the degree distribution. For $z_1 < 1$, all the connected components of the network remain finite in the infinite network limit (by definition, this is a thermodynamic limit). If $z_1 > 1$, the giant connected component arises, whose size is proportional to the size of the whole network. The condition for the emergence of the giant connected component [21], [22], $z_1 > 1$, may be written as:

$$ \overline{q^2} - 2\bar q = \sum_q q(q-2)\,\Pi(q) = \sum_{q \ge 3} q(q-2)\,\Pi(q) - \Pi(1) > 0 . \qquad (2) $$

So, the giant connected component is formed only if the fraction of "dead ends", $\Pi(1)$, is sufficiently small. If "dead ends" are absent, the giant connected component exists and, in the thermodynamic limit, includes almost all vertices. (We do not consider the case when the network consists solely of vertices of degree two.)

We present a consistent approach allowing rigorous calculations of the intervertex distance statistics within a giant connected component in such random networks. The main objects we will consider are the sizes of connected components of a vertex in the graph (see a schematic view of the structure of an uncorrelated network in Fig. 1). The n-th order connected component of a vertex consists of all the vertices within the first n coordination spheres of the vertex; in other words, the distance to the central vertex in the n-th connected component does not exceed n. Obviously, in a random graph the size of the connected component is a fluctuating random variable.

FIG. 1. Connected components of a vertex. The first three components, shown inside the shaded area, are trees. The higher ones, shown outside the shaded area, are assumed to contain a finite fraction of the network, and, therefore, may contain closed loops.

The idea of our method is to construct a recurrent relation expressing the size distribution of the (n+1)-th connected component through that of the n-th connected component. This relation can be derived in two limiting cases: when the size of a connected component is negligibly small compared to the size of a network, and when a connected component is a finite fraction of an infinitely large network. Sewing together the results in these limiting cases yields the complete set of connected component size distributions. In particular, this allows us to obtain the intervertex distance distribution in a random graph.

The main results of this paper are as follows:

(1) We find an explicit expression for the mean intervertex distance, $\ln N/\ln z_1 + \mathrm{const}$. In Ref. [15] this result was obtained as an estimate; here we present an exact result with an exact constant.

(2) We obtain the form of the intervertex distance distribution and show that in the networks under consideration, almost all vertices within a giant connected component are nearly equidistant. More precisely, we find that the mean square deviation of an intervertex distance is finite in the infinite network.
To find the intervertex distribution function, one has to solve the functional equation, whose form is determined 
by the degree distribution in the network. Sometimes this can be done explicitly (two examples are considered in 
the paper). However, even in a general case, all essential features of the distance distribution can be reproduced 
analytically. 

First, the cumulative distance distribution $Q(d, N)$ (the probability that the intervertex distance is less than or equal to $d$) turns out to be a function of $l = d - \bar d(N)$ only, $Q(d, N) = Q(l)$, where $\bar d(N) = \ln(AN)/\ln z_1$ is the average intervertex distance, and $A$ is a number of the order of unity.

Second, we find the asymptotics of $Q(l)$ at large deviations $l$ of distances from the mean value, both positive and negative. At large negative $l$, the asymptotics is determined by the first two moments of the degree distribution, $Q(l) \sim z_1^{\,l}$. The asymptotics of $Q(l)$ on the other side, at large positive $l$, is determined by vertices with the lowest degrees. Obviously, $Q(l) \to m_\infty^2$ as $l \to \infty$, where $m_\infty$ is the capacity of the giant connected component, because $m_\infty^2$ is precisely the probability that a randomly chosen pair of vertices is interconnected. If the lowest degree is either one or two, then $m_\infty^2 - Q(l)$ at $l \to +\infty$ decreases exponentially with a linear preexponential factor. If the lowest degree of a vertex is three or higher, then this difference decreases faster than an exponent.

The scheme suggested herein works only if the parameter $z_1$ is finite, which means the convergence of the first and second moments of the degree distribution in the thermodynamic limit. This is not the case if the degree distribution asymptotically behaves as $\Pi(q) \sim q^{-\gamma}$ with $\gamma \le 3$ at large degrees $q$. We have studied the case of $\Pi(q) \propto q^{-\gamma}\exp(-q/q_0)$, $2 < \gamma < 3$, with a large but finite value of the cut-off parameter $q_0$. As a result, we have found that in this case $\bar d \sim \ln N/\ln q_0$, and $Q(l)$ is independent both of the system size and the cut-off parameter. Again, the mean square deviation of intervertex distances is of the order of unity, and all the vertices in the giant connected component are nearly equidistant.

This result is valid in the limit $N \to \infty$, when we assume that the cut-off parameter is large. In reality, in finite-size networks a degree distribution has some natural size-dependent cut-off. How the cut-off parameter varies with the size of the network depends on the details of the construction procedure. For example, in the configuration model the position of the cut-off depends on how the limit $N(q)/N \to \Pi(q)$ is approached. We show that the picture described above remains valid if the cut-off parameter grows with the network size sufficiently slowly, namely, not faster than $\ln q_0 \sim \ln N/\ln\ln N$. Thus, for "scale-free" networks with $2 < \gamma < 3$, we arrive at the stable distribution of intervertex distances around their steadily growing mean value $\bar d(N)$ if the construction procedure ensures a not very fast growth of the degree distribution cut-off with the network size. In this situation, $\bar d$ grows with $N$ slower than $\ln N$ but faster than $\ln\ln N$.

The paper is organized as follows. In Section II we define the main notions. In Section III we recall and essentially refine the approach of Ref. [15], which is based on the tree ansatz and is valid for finite-size connected components in the infinite network. In Section IV we present the recursion relation between the sizes of the n-th and (n+1)-th connected components in the limit when these sizes are both infinite, taking into account loops. In Section V we explain how the results of the two previous sections can be sewn together in the region where the size of a connected component is large compared to unity but small compared to the size of the whole network. In Section VI the results of the previous sections are briefly summed up and general results for various quantities of interest are presented. In Section VII, as an illustrative example, we present an exact analytical solution for the uncorrelated network with the degree distribution $\Pi(q) = C q^{-1}\zeta^q$, $\zeta < 1$. In Section VIII the network with the degree distribution $\Pi(q) \propto q^{-\gamma}\zeta^q$, $2 < \gamma < 3$, $\zeta < 1$, $1 - \zeta \ll 1$, is studied. In Section IX we summarize the results obtained in the paper and discuss the size dependence $\bar d(N)$ in situations when it grows slower than $\ln N$, e.g. as $\ln\ln N$ or $\ln N/\ln\ln N$. Some technical details are presented in two Appendices.



II. DEFINITIONS 



A graph consists of vertices connected by edges. An undirected graph is described by its symmetric adjacency matrix $\hat a$. Elements of this matrix are either $a_{ij} = a_{ji} = 1$, if vertices $i$ and $j$ are connected, or $a_{ij} = a_{ji} = 0$ otherwise. We consider only graphs with $a_{ii} = 0$, that is, ones without "tadpoles" (edges with both ends attached to the same vertex). The degree of a vertex, $q_i$ (sometimes called the vertex connectivity), is the number of edges attached to the vertex: $q_i = \sum_j a_{ij}$. Random networks are usually described in terms of a statistical ensemble: the set of graphs $G$ with corresponding statistical weights, a non-negative function $P(g)$, $g \in G$, defined on this set [23], [24], [25], [26].

Let us consider a statistical ensemble of undirected graphs, each of which contains $N \to \infty$ vertices. Let us choose an ensemble characterized by a degree distribution $\Pi(q)$ and maximally random otherwise. Several ensembles, equivalent in the thermodynamic limit $N \to \infty$, may be used [25]. For example, one can use a "microcanonical" one, usually referred to as the "configuration model". Here we ascribe equal statistical weights to all possible graphs with $N = \sum_q N(q)$ vertices, $N(q)$ of which have degree $q$, $q = 1, 2, \ldots$ (without loss of generality we can exclude the possibility that a vertex is of zero degree). We assume that in the thermodynamic limit, $N \to \infty$, $N(q)/N \to \Pi(q)$. For this ensemble, we have the degree distribution:



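$$ \langle \delta_K(q_i - q)\rangle = \left\langle \frac{1}{N}\sum_{i=1}^{N}\delta_K(q_i - q)\right\rangle = \Pi(q) . \qquad (3) $$

One can show that in the thermodynamic limit even degrees of the nearest-neighbour vertices are uncorrelated in such networks:

$$ \frac{1}{N}\left\langle \sum_{i,j=1}^{N} a_{ij}\,\delta_K(q_i - q)\,\delta_K(q_j - q')\right\rangle = \frac{q q'}{\bar q}\,\Pi(q)\,\Pi(q') , \qquad (4) $$

where $\bar q = 2L/N$ is the average vertex degree ($L$ is the total number of edges). We introduced the notation $\delta_K(q - q')$ for the Kronecker symbol $\delta_{qq'}$. Relation (4) plays a crucial role. In fact, the scheme presented here is based on this relation.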
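As an illustration of the "microcanonical" construction described above, the following minimal Python sketch pairs the edge ends ("spines") of all vertices uniformly at random. The names `configuration_model` and `degree_counts`, and the list-of-degrees input format, are our own illustrative choices and are not taken from the original formulation. One simplifying assumption is made: plain random pairing may occasionally produce tadpoles or multiple edges, which the ensemble considered here excludes; the sketch does not remove them.

```python
import random
from collections import Counter


def configuration_model(degrees, seed=None):
    """Return an edge list obtained by pairing all spines at random.

    degrees[i] is the prescribed degree q_i of vertex i (illustrative format).
    Tadpoles and multiple edges are NOT removed (simplifying assumption).
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the total number of spines must be even")
    rng = random.Random(seed)
    # One list entry per spine: vertex i appears q_i times.
    spines = [i for i, q in enumerate(degrees) for _ in range(q)]
    rng.shuffle(spines)
    # Consecutive spines in the shuffled list are joined into an edge.
    return list(zip(spines[0::2], spines[1::2]))


def degree_counts(edges):
    """Count how many edge ends are attached to each vertex."""
    counts = Counter()
    for i, j in edges:
        counts[i] += 1
        counts[j] += 1
    return counts
```

By construction, every vertex ends up with exactly its prescribed number of edge ends, so the degree sequence $\{N(q)\}$ is reproduced for any random pairing.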

We call the set of vertices, for which the shortest distance from some vertex equals n, the n-th shell of this vertex. 
The union of the shells of a vertex from zeroth to n-th one inclusively is called the n-th (connected) component of the 
vertex. 
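The definitions of shells and components translate directly into a breadth-first search. The sketch below is only an illustration under our own conventions: the graph is assumed to be given as a dictionary `adj` mapping each vertex to a list of its neighbours, and the function names are hypothetical.

```python
from collections import deque
from itertools import accumulate


def shell_sizes(adj, source):
    """Sizes of the 0-th, 1-st, 2-nd, ... shells of `source` (breadth-first search)."""
    dist = {source: 0}
    sizes = [1]  # the zeroth shell is the vertex itself
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                if dist[v] == len(sizes):
                    sizes.append(0)  # a new shell is opened
                sizes[dist[v]] += 1
                queue.append(v)
    return sizes


def component_sizes(adj, source):
    """Sizes M_n of the n-th components: cumulative sums of the shell sizes."""
    return list(accumulate(shell_sizes(adj, source)))
```

For a chain of four vertices 0-1-2-3 and the central vertex 1, the shells have sizes 1, 2, 1 and the components have sizes 1, 3, 4.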

Following [15], it is convenient to use the degree distribution in Z-representation (sometimes this object is called the generating function of the distribution):

$$ \phi(x) = \left\langle \frac{1}{N}\sum_{i=1}^{N} x^{q_i}\right\rangle = \sum_{q=0}^{\infty} \Pi(q)\,x^q . \qquad (5) $$

Another useful quantity is what may be called an "edge multiplication" distribution function $\Pi_1(q)$. This is the conditional probability that in a connected pair of vertices, a vertex has its degree equal to $q + 1$:

$$ \Pi_1(q) = \frac{\langle a_{ij}\,\delta_K(q_i - q - 1)\rangle}{\langle a_{ij}\rangle} = \frac{\langle q_i\,\delta_K(q_i - q - 1)\rangle}{\langle q_i\rangle} = \frac{(q+1)\,\Pi(q+1)}{\bar q} , \qquad (6) $$

or, in Z-representation,

$$ \phi_1(x) = \sum_{q=0}^{\infty}\Pi_1(q)\,x^q = \frac{\phi'(x)}{\phi'(1)} . \qquad (7) $$

Here we use the fact that in our ensemble all pairs of vertices are statistically equivalent. $\Pi_1(q)$ may also be thought of as the probability that, choosing a random edge (but not a vertex!) and going along it in either direction, we arrive at a vertex which has a degree equal to $q + 1$, so that there exist $q$ different possibilities to move further.
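As a small numerical illustration of the "edge multiplication" distribution, the sketch below computes $\Pi_1(q) = (q+1)\Pi(q+1)/\bar q$ together with the mean degree $\bar q = \phi'(1)$ and the mean of $\Pi_1$, which coincides with the parameter $z_1$ of Eq. (1). The dictionary input format and the function name are our own assumptions.

```python
def edge_multiplication(pi):
    """Return (pi1, mean_degree, z1) for a degree distribution.

    pi maps a degree q to its probability Pi(q) (illustrative input format).
    pi1 maps q to Pi_1(q) = (q + 1) * Pi(q + 1) / mean_degree.
    """
    mean_degree = sum(q * p for q, p in pi.items())
    # A vertex of degree q, reached along an edge, offers q - 1 ways to move further.
    pi1 = {q - 1: q * p / mean_degree for q, p in pi.items() if q >= 1}
    # Mean of Pi_1, equal to (<q^2> - <q>) / <q>.
    z1 = sum(q * p for q, p in pi1.items())
    return pi1, mean_degree, z1
```

For example, for $\Pi(1) = \Pi(3) = 1/2$ one obtains $\bar q = 2$, $\Pi_1(0) = 1/4$, $\Pi_1(2) = 3/4$ and $z_1 = 3/2$.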



III. MICROSCOPIC COMPONENTS 



By "microscopic components" of a vertex we mean the components of a size negligible compared to the size of the 
network. 

The role of $\phi_1(x)$ may be understood from the following reasoning. Let us choose a random vertex $i$ of degree $q_i$. Assume that its n-th connected component is a tree. Then it consists of $q_i$ trees generated by every edge attached to the vertex. Let $S_n^{(j)}$, $j = 1, \ldots, q_i$, be the number of vertices in such a tree. Obviously, the total number of vertices in the n-th component is $M_n = 1 + \sum_{j=1}^{q_i} S_n^{(j)}$, and the $S_n^{(j)}$, by the definition of the statistical ensemble under consideration, are equally distributed, independent (in the thermodynamic limit) random variables. For the distribution of $M_n$, we have in Z-representation:

$$ \Phi_n(x) = \langle x^{M_n}\rangle = x\left\langle \langle x^{S_n}\rangle^{q}\right\rangle = x\,\phi[F_n(x)] , $$

where $F_n(x) = \langle x^{S_n}\rangle$ is the distribution function of $S_n$, the number of vertices in the n-th order tree formed by a randomly chosen edge. We have $S_{n+1} = 1 + \sum_{j=1}^{q_1} S_n^{(j)}$, where the distribution function of $q_1$ is $\phi_1(x)$ in Z-representation. Then we obtain finally for the size distribution of the n-th component:

$$ \Phi_n(x) = x\,\phi[F_n(x)] , \qquad F_{n+1}(x) = x\,\phi_1[F_n(x)] , \qquad F_1(x) = x . \qquad (8) $$



As $n \to \infty$ and $|x| < 1$, $F_n(x) \to F(x)$, where $F(x)$ describes the size distribution of finite components attached to a randomly chosen edge. Then $H(x) = \phi[F(x)]$ is the finite-component size distribution. Note that $t_c = F(1) = \lim_{x \to 1-0} F(x)$ is the stable fixed point of the recursion relation $t_{n+1} = \phi_1(t_n)$. Taking into account that $\phi_1(x)$ is monotonically increasing and convex downward for $0 \le x \le 1$, and $\phi_1(1) = 1$, one can conclude that $t_c = 1$ if $\phi_1'(1) = z_1 < 1$, and $t_c < 1$ if $z_1 > 1$. In the latter case we have $H(1) = \phi(t_c) < 1$. But $H(1)$ is the probability that a randomly chosen vertex belongs to some finite component. This means that as

$$ z_1 = \phi_1'(1) > 1 \qquad (9) $$

a giant connected component appears in the network. Its capacity (the probability that a vertex belongs to the giant component) is

$$ m_\infty = 1 - \phi(t_c) . \qquad (10) $$
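The fixed point $t_c$ and the capacity of the giant component can be evaluated numerically by simply iterating $t_{n+1} = \phi_1(t_n)$ from $t = 0$; since $\phi_1$ is monotonically increasing on $[0, 1]$, the iterates grow towards the smallest fixed point. The sketch below is an illustration only: the dictionary input format, the function name and the stopping tolerance are our own assumptions.

```python
def giant_component(pi, tol=1e-13, max_iter=1_000_000):
    """Return (t_c, m_inf) for a degree distribution pi: degree -> probability."""
    mean_degree = sum(q * p for q, p in pi.items())

    def phi(x):
        return sum(p * x ** q for q, p in pi.items())

    def phi1(x):
        # phi_1(x) = phi'(x) / phi'(1)
        return sum(q * p * x ** (q - 1) for q, p in pi.items() if q >= 1) / mean_degree

    t = 0.0
    for _ in range(max_iter):
        t_next = phi1(t)
        converged = abs(t_next - t) < tol
        t = t_next
        if converged:
            break
    # Capacity of the giant component: m_inf = 1 - phi(t_c).
    return t, 1.0 - phi(t)
```

For $\Pi(1) = \Pi(3) = 1/2$ (so that $z_1 = 3/2 > 1$) the recursion reads $t_{n+1} = 1/4 + 3t_n^2/4$, which gives $t_c = 1/3$ and $m_\infty = 22/27$. For $\Pi(1) = 0.9$, $\Pi(3) = 0.1$ one has $z_1 = 1/2 < 1$, and the iteration converges to $t_c = 1$, i.e. $m_\infty = 0$.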



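The average number of vertices in the n-th connected component of a vertex is $\langle M_n\rangle = \Phi_n'(1) = 1 + z_0 F_n'(1)$, $z_0 = \phi'(1)$. One can easily find from Eq. (8) that $F_n'(1) = (z_1^n - 1)/(z_1 - 1)$, which for $z_1 > 1$ grows without bound as $n \to \infty$, where $z_1 = \phi_1'(1) = \phi''(1)/\phi'(1)$. Let us introduce, instead of $F_n$, the sequence of functions $f_n$ defined by

$$ F_n(x) = f_n\!\left(\frac{z_1^n - 1}{z_1 - 1}\,\ln\frac{1}{x}\right) . $$

The recursion relation then turns into

$$ f_{n+1}(y) = \exp\!\left(-\frac{z_1 - 1}{z_1^{n+1} - 1}\,y\right)\phi_1\!\left[f_n\!\left(\frac{z_1^n - 1}{z_1^{n+1} - 1}\,y\right)\right] , \qquad f_1(y) = e^{-y} . $$

As $n \to \infty$, it may be replaced with

$$ f_{n+1}(y) = \phi_1[f_n(y/z_1)] , \qquad f_1(y) = e^{-y} . \qquad (11) $$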



From $\phi_1(1) = 1$ it follows that $f_n(0) = F_n(1) = 1$. Also, we now have $f_n'(0) = -1$ independent of $n$. Taking into account that $\phi_1(x)$ is analytic, monotonically growing and convex downward for $0 \le x \le 1$, one can prove that the sequence $f_n$ converges as $n \to \infty$ to some function $f(y)$. The latter may be found from the stationarity condition:

$$ f(y) = \phi_1[f(y/z_1)] ; \qquad f(0) = 1 , \quad f'(0) = -1 . \qquad (12) $$

The above conditions determine $f(y)$ uniquely. One can check this, e.g., by taking subsequent derivatives of Eq. (12) at $y = 0$, which allows us to express $f^{(k)}(0)$ through $f^{(l)}(0)$, $l < k$. Then we have asymptotically at $n \to \infty$:

$$ \Phi_n(x) = \phi\!\left[f\!\left(\frac{z_1^n}{z_1 - 1}\,\ln\frac{1}{x}\right)\right] . \qquad (13) $$



The distribution function for the size of the n-th connected component, $M_n$, in the usual representation, $P_n(M)$, is the inverse Z-transform of $\Phi_n$:

$$ P_n(M) = \oint_{|x| = 1-\delta} \frac{dx}{2\pi i}\, x^{-M-1}\,\Phi_n(x) . \qquad (14) $$

Taking into account Eq. (13), we have in the limit $n \to \infty$

$$ P_n(M) \to (z_1 - 1)\,z_1^{-n}\,p\!\left[(z_1 - 1)\,z_1^{-n} M\right] , \qquad p(s) = \int_{-i\infty+\delta}^{+i\infty+\delta}\frac{dy}{2\pi i}\,g(y)\,e^{sy} , \qquad g(y) = \phi[f(y)] . \qquad (15) $$

Note the order of limiting transitions adopted in this section: $\lim_{n\to\infty}\lim_{N\to\infty}$. The first is the thermodynamic limit, and only then is the order of a connected component, $n$, tended to infinity.

Several essential assumptions have been made during the derivation of the above formulae. First, it was required that $\phi'(1)$ and $\phi''(1)$ are finite, which means that the degree distribution $\Pi(q)$ has finite first and second moments. This will be assumed everywhere below. Second, the graph must be a tree. In fact, it is sufficient that the n-th connected component of a vertex is a tree. But almost all n-th components of an infinite uncorrelated random graph are trees at finite $n$. Indeed, the probability that two vertices in the n-th component are connected by an edge is proportional to the ratio of the total numbers of edges inside and outside this component. If $n$ is fixed and $N \to \infty$, this ratio scales as $M_n/N$. Our final result for the component size distribution, Eqs. (12), (15), is valid in the limit $n \to \infty$, which must be taken after the "thermodynamic" limit $N \to \infty$. We emphasize that the order of limits is essential here.



IV. MACROSCOPIC COMPONENTS 



Now we assume a different situation: the size of the graph, $N$, and the order of a connected component, $n$, simultaneously tend to infinity. At the same time, we assume that the distribution of the capacity, $m_n = M_n/N$, of the n-th connected component, $p_n(m)$, tends to some limiting N-independent distribution. From the results of the previous section it follows that we have to assume that $z_1^n/N$ remains constant.

In this case the n-th connected component is no longer a tree. However, it appears to be possible to derive an exact (in the thermodynamic limit) relation between $p_n(m)$ and $p_{n+1}(m)$. The idea is to use the law of large numbers. Assume we have the n-th connected component with $M_n = N m_n$ vertices. Also, we assume that the number of edges which connect vertices inside the n-th component to vertices outside this component is $L_n = N l_n$. Due to the randomness of the graph, $m_{n+1}$ and $l_{n+1}$ would be fluctuating variables even if $m_n$ and $l_n$ are fixed. However, in the thermodynamic limit the fluctuations of the intensive variables $m_{n+1}$ and $l_{n+1}$ tend to zero. So, the evolution of the n-th connected component, as $n$ grows, is governed by a 2-d mapping: $(m_n, l_n) \to (m_{n+1}, l_{n+1})$.

This mapping may be constructed as follows. Let $\Pi_n(q)$, or $\phi_n(y)$ in Z-representation, be the degree distribution of vertices outside the n-th component. (Do not confuse $\phi_{n=1}(y)$ with $\phi_1(y) = \phi'(y)/\phi'(1)$.) Their total degree is $N(1 - m_n)\,\phi_n'(1)$. $N l_n$ edges of this number are to be chosen to connect with vertices of the (n+1)-th shell. All such choices are equiprobable, because of the nature of the statistical ensemble of graphs under consideration. So, the probability that a vertex of degree $q$ outside the n-th component is not connected to a vertex inside the n-th component equals $(1 - c_n)^q$, where $c_n = l_n/[(1 - m_n)\,\phi_n'(1)]$. Then the fraction of vertices remaining outside the (n+1)-th component, $1 - m_{n+1}$, is given by

$$ \frac{1 - m_{n+1}}{1 - m_n} = \sum_q \Pi_n(q)\,(1 - c_n)^q = \phi_n(1 - c_n) . \qquad (16) $$

Also, we have a recursive relation for the degree distribution function:

$$ \Pi_{n+1}(q) = \frac{1 - m_n}{1 - m_{n+1}}\,\Pi_n(q)\,(1 - c_n)^q , \qquad (17) $$

or, in Z-representation,

$$ \phi_{n+1}(y) = \frac{\phi_n[(1 - c_n)\,y]}{\phi_n(1 - c_n)} . \qquad (18) $$

Repeatedly applying Eq. (18), introducing $t_n = (1 - c_{n-1})(1 - c_{n-2})\cdots(1 - c_1)$, and using $\phi_{n=1}(x) = \phi(x)$, one can write:

$$ \phi_n(y) = \frac{\phi(t_n y)}{\phi(t_n)} . \qquad (19) $$

From Eqs. (16) and (19) we obtain the relation

$$ 1 - m_n = \phi(t_n) , \qquad (20) $$

relating $t_n$ and $m_n$. From the definition of $t_n$ we obtain the following equation:

$$ t_{n+1} = (1 - c_n)\,t_n = t_n - \frac{l_n t_n}{(1 - m_n)\,\phi_n'(1)} = t_n - \frac{l_n}{\phi'(t_n)} , \qquad (21) $$

where Eqs. (19), (20), and the definition of $c_n$ were used. Eqs. (20) and (21) express $m_{n+1}$ in terms of $m_n$ and $l_n$.

The total degree of vertices outside the n-th component may be written as $N(1 - m_n)\,\phi_n'(1) = N t_n\phi'(t_n)$, and outside the (n+1)-th one as $N t_{n+1}\phi'(t_{n+1})$. Therefore, the total degree of vertices in the (n+1)-th shell is $N[t_n\phi'(t_n) - t_{n+1}\phi'(t_{n+1})]$. Of this number, $N l_n$ edges are attached to vertices in the n-th component. Each of the remaining "free" $N[t_n\phi'(t_n) - t_{n+1}\phi'(t_{n+1}) - l_n]$ edges may be attached either to a vertex outside the (n+1)-th component, or to some other vertex in the (n+1)-th shell (see Fig. 1). The respective probabilities relate as the total degree outside the (n+1)-th component, $N t_{n+1}\phi'(t_{n+1})$, and the number of "free" edges in the (n+1)-th shell. So, we have for the number of edges $N l_{n+1}$ going out from the (n+1)-th component:

$$ l_{n+1} = \left[t_n\phi'(t_n) - t_{n+1}\phi'(t_{n+1}) - l_n\right]\frac{t_{n+1}\phi'(t_{n+1})}{t_n\phi'(t_n) - l_n} = t_{n+1}\phi'(t_{n+1})\left[1 - \frac{\phi'(t_{n+1})}{\phi'(t_n)}\right] , \qquad (22) $$

where Eq. (21) was used.

Eqs. (21) and (22) define the 2-d mapping $(t_n, l_n) \to (t_{n+1}, l_{n+1})$, where $t_n$ is related to the capacity of the n-th component $m_n$ by Eq. (20). This mapping can be reduced to a 1-d one, because the "first integral" of the 2-d mapping can easily be found. Namely, using Eq. (22), one can express $l_n$ through $t_n$ and $t_{n-1}$, and substitute this expression into Eq. (21). The result is

$$ \frac{t_{n+1}}{\phi'(t_n)} = \frac{t_n}{\phi'(t_{n-1})} . \qquad (23) $$

Repeatedly applying this relation and taking into account the fact that the (limiting) starting point of the $(t_n, l_n)$ sequence is $(1, 0)$, we obtain the following recursion relation:

$$ t_{n+1} = \frac{\phi'(t_n)}{\phi'(1)} = \phi_1(t_n) . \qquad (24) $$
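To see how the 1-d mapping (24) drives the capacity $m_n = 1 - \phi(t_n)$ of Eq. (20) from a microscopic value to the giant-component value, one can iterate it numerically. The sketch below is an illustration under our own assumptions: the starting value `t_start` slightly below 1 stands in for the limiting starting point $(1, 0)$ (for exactly $t = 1$ the mapping does not move), and the function name and input format are hypothetical.

```python
def capacity_sequence(pi, t_start, n_steps):
    """Iterate t -> phi_1(t) and return the capacities m = 1 - phi(t) along the way.

    pi maps a degree q to its probability Pi(q); t_start < 1 is an assumed
    initial value close to the limiting starting point t = 1.
    """
    mean_degree = sum(q * p for q, p in pi.items())

    def phi(x):
        return sum(p * x ** q for q, p in pi.items())

    def phi1(x):
        return sum(q * p * x ** (q - 1) for q, p in pi.items() if q >= 1) / mean_degree

    t = t_start
    capacities = []
    for _ in range(n_steps):
        capacities.append(1.0 - phi(t))  # capacity of the current component
        t = phi1(t)                      # one step of the 1-d mapping
    return capacities
```

For $\Pi(1) = \Pi(3) = 1/2$ and `t_start = 1 - 1e-6`, the capacity first grows by a factor close to $z_1 = 3/2$ per step, as in the tree regime, and then saturates at $m_\infty = 22/27$.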



V. SEWING TOGETHER 



In the last two sections we described the recursion relations in the problem under consideration in two limiting 
cases. Now we must sew them together. 

Let us define $G_l(t)$ through the recursion relation $G_{l+1}(t) = \phi_1[G_l(t)]$ with the initial condition $G_0(t) = t$. Introducing $f_l(x)$ as $G_l(t) = f_l(z_1^l \ln(1/t))$, we have $f_{l+1}(x) = \phi_1[f_l(x/z_1)]$, $f_0(x) = e^{-x}$, which exactly coincides with Eq. (11). Then in the limit $l \to \infty$ we have $G_l(t) = f(z_1^l \ln(1/t))$, where $f(x) = \lim_{l\to\infty} f_l(x)$ must be found from Eq. (12). This is the same function as that in Eq. (13).

Iterating Eq. (24) $l \to \infty$ times yields

$$ t_{n+l} = f\!\left(z_1^l \ln\frac{1}{t_n}\right) = f\!\left[z_1^l\,(1 - t_n)\right] . \qquad (25) $$

Here, it must be assumed that $1 - t_n \to 0$ together with $l \to \infty$. The distribution functions $P_n(t_n)$ and $P_{n+l}(t_{n+l})$ are connected by the relation $P_{n+l}(t_{n+l})\,dt_{n+l} = P_n(t_n)\,dt_n$. Therefore, we obtain

$$ P_{n+l}(t) = z_1^{-l}\,\left|(f^{-1})'(t)\right|\,P_n\!\left[1 - z_1^{-l} f^{-1}(t)\right] , \qquad (26) $$

where $f^{-1}$ and $(f^{-1})'$ are the inverse function and its derivative.

As $n \to \infty$ and $N \to \infty$ under the condition $z_1^n/N \to 0$, the distribution of the capacity of the n-th connected component, $m_n = M_n/N$, can be obtained from Eq. (15). But in this limit we have from Eq. (20): $m_n = M_n/N = 1 - \phi(t_n) = z_0(1 - t_n)$, $z_0 = \phi'(1)$. Then, we obtain:

$$ P_n(t) = z_0(z_1 - 1)\,z_1^{-n} N\,p\!\left[z_0(z_1 - 1)\,z_1^{-n} N\,(1 - t)\right] . \qquad (27) $$

Substituting Eq. (27) into Eq. (26) and denoting $n + l$ as $n$ yields finally:

$$ P_n(t) = \nu_n\left|(f^{-1})'(t)\right|\,p\!\left[\nu_n f^{-1}(t)\right] , \qquad \nu_n = z_0(z_1 - 1)\,z_1^{-n} N . \qquad (28) $$

This formula is valid if $N \to \infty$, $n \to \infty$ without any restriction on the order of the limits.



VI. GENERAL RESULTS



Thus, we suggest a regular procedure for calculating the statistical properties of intervertex distances in a random network. This procedure is valid for any large graph with uncorrelated vertices, provided that the degree distribution $\Pi(q)$ has finite first and second moments. Quantities of interest may be expressed in terms of the function $g(y)$ and its inverse Laplace transform $p(x)$. To obtain them one has to perform the following steps:

1. Calculate the Z-transform of the degree distribution, $\phi(x)$, Eq. (5), and $\phi_1(x) = \phi'(x)/\phi'(1)$.

2. Find $f(y)$, which is the solution of the equation $f(z_1 y) = \phi_1[f(y)]$, where $z_1 = \phi_1'(1)$, with the conditions $f(0) = 1$, $f'(0) = -1$.

3. Obtain $g(y) = \phi[f(y)]$.

4. Calculate $p(x)$, which is the inverse Laplace transform of $g(y)$, Eq. (15).
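Steps 1 to 3 can be carried out numerically even when the functional equation has no closed-form solution. The sketch below is one possible illustration, not the method used in the paper: it approximates $f(y)$ by applying $\phi_1$ a fixed number of times (`depth`, our own parameter) to $\exp(-y/z_1^{\mathrm{depth}})$, which is the recursion $f_{n+1}(y) = \phi_1[f_n(y/z_1)]$ started from an exponential. It assumes $z_1 > 1$; the function names and the dictionary input format are hypothetical. Step 4, the inverse Laplace transform, is not attempted here.

```python
import math


def make_phi(pi):
    """Step 1: build phi, phi_1 and the moments z0 = phi'(1), z1 = phi_1'(1)."""
    z0 = sum(q * p for q, p in pi.items())
    z1 = sum(q * (q - 1) * p for q, p in pi.items()) / z0

    def phi(x):
        return sum(p * x ** q for q, p in pi.items())

    def phi1(x):
        return sum(q * p * x ** (q - 1) for q, p in pi.items() if q >= 1) / z0

    return phi, phi1, z0, z1


def f_approx(pi, y, depth=40):
    """Step 2 (approximate): iterate phi_1 `depth` times on exp(-y / z1**depth)."""
    _, phi1, _, z1 = make_phi(pi)
    t = math.exp(-y / z1 ** depth)
    for _ in range(depth):
        t = phi1(t)
    return t


def g_approx(pi, y, depth=40):
    """Step 3: g(y) = phi[f(y)]."""
    phi, _, _, _ = make_phi(pi)
    return phi(f_approx(pi, y, depth))
```

A simple consistency check is to verify that the approximation satisfies $f(z_1 y) \approx \phi_1[f(y)]$, has a slope close to $-1$ at small $y$, and tends to $t_c$ at large $y$ (for $\Pi(1) = \Pi(3) = 1/2$, $t_c = 1/3$). The choice of `depth` is a trade-off: larger values reduce the truncation error, which decays roughly as $z_1^{-\mathrm{depth}}$, but amplify floating-point round-off in the initial exponential.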

The most nontrivial is step 2: no general methods for the analytic solution of such functional equations are known. However, the asymptotic behaviour of $f(y)$ may be easily extracted. At $y \to 0$, we have $f(y) = 1 - y + o(y)$. Therefore, $g(y) = 1 - z_0 y + o(y)$, $z_0 = \phi'(1)$. As $y \to +\infty$, we have $f(y) \to t_c$, where $0 \le t_c < 1$ is the root of the equation $t_c = \phi_1(t_c)$. At large positive $y$, one can write $f(y) = t_c + h(y)$, where $h(y) \to 0$ when $y \to +\infty$. Then Eq. (12) can be linearized with respect to $h$, which gives:

$$ h(z_1 y) = z_c\,h(y) , \qquad (29) $$

where $z_c = \phi_1'(t_c) < 1$. Looking for the solution in the form $h(y) = A y^{-\alpha}$, one can easily obtain the exponent $\alpha = -\ln z_c/\ln z_1 > 0$. Then we obtain the asymptotic behaviour of $g(y)$ at large positive $y$:

$$ g(y) = \phi(t_c) + m_\infty\,\tilde g(y) = 1 - m_\infty + m_\infty\,\tilde g(y) , \qquad \tilde g(y) \cong B y^{-\alpha} , \qquad \alpha = \frac{\ln(1/z_c)}{\ln z_1} . \qquad (30) $$

For $p(x)$, the inverse Laplace transform of $g(y)$, we have:

$$ \int_0^\infty dx\,p(x) = g(0) = 1 , \qquad \int_0^\infty dx\,x\,p(x) = -g'(0) = z_0 = \phi'(1) . \qquad (31) $$

From $g(+\infty) = 1 - m_\infty$ it follows that $p(x)$ has a $\delta$-functional part, $p(x) = (1 - m_\infty)\,\delta(x) + m_\infty\,\tilde p(x)$, where $\tilde p(x)$ and $\tilde g(y)$ are related through the Laplace transform. From the asymptotic expression for $\tilde g(y)$ at large $y$ follows the one for $\tilde p(x)$ at small $x$:

$$ \tilde p(x) \cong \frac{B}{\Gamma(\alpha)}\,x^{\alpha - 1} . \qquad (32) $$

Various physical quantities may be expressed in terms of the functions $g(x)$ and $p(x)$. For example, the distribution function $\mathcal{P}_n(m)$ of the relative size of the n-th connected component, $m_n = M_n/N$, can be obtained from Eq. (28) and the relation $m_n = 1 - \phi(t_n)$. We have:

$$ \mathcal{P}_n(m) = \nu_n\left|(g^{-1})'(1 - m)\right|\,p\!\left[\nu_n\,g^{-1}(1 - m)\right] , \qquad (33) $$

where $g^{-1}$ is the function inverse to $g(x)$. One can write $\mathcal{P}_n(m) = (1 - m_\infty)\,\delta(m) + m_\infty\,\tilde{\mathcal{P}}_n(m)$, where the first term corresponds to finite connected components of the graph, and the second one to the giant connected component. The order of a connected component, $n$, and the size of the graph, $N$, enter the distribution functions in the combination $\nu_n$, which may be written as

$$ \nu_n = z_1^{\,n_0 - n} , \qquad n_0 = \frac{\ln[z_0(z_1 - 1)N]}{\ln z_1} . \qquad (34) $$

Therefore, $\mathcal{P}_n(m, N) = \mathcal{P}(n - n_0, m)$. If $n < n_0$, $n_0 - n \gg 1$, one can write

$$ \mathcal{P}(n - n_0, m) = z_1^{\,n_0 - n}\,p\!\left(z_1^{\,n_0 - n}\,m\right) . \qquad (35) $$

This limit corresponds to the sizes of connected components being infinitely small compared with the graph size. In this case, for $m \ll z_1^{\,n - n_0}$ the small-size asymptotics of the distribution function is

$$ \tilde{\mathcal{P}}(n - n_0, m) = \frac{B}{\Gamma(\alpha)}\,z_1^{\,\alpha(n_0 - n)}\,m^{\alpha - 1} . \qquad (36) $$
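In the opposite limit, $n > n_0$, $n - n_0 \gg 1$, the contribution of the giant connected component to the distribution function is concentrated near $m = m_\infty$. Here one can write, inverting the asymptotic expression (30) for $g$:

$$ \tilde{\mathcal{P}}(n - n_0, m) = \frac{z_1^{\,n_0 - n}}{\alpha\,m_\infty B}\left(\frac{m_\infty - m}{m_\infty B}\right)^{-1/\alpha - 1}\tilde p\!\left[z_1^{\,n_0 - n}\left(\frac{m_\infty - m}{m_\infty B}\right)^{-1/\alpha}\right] . \qquad (37) $$

In this case, for $z_c^{\,n - n_0} \ll m_\infty - m \ll 1$, we have:

$$ \tilde{\mathcal{P}}(n - n_0, m) \cong \frac{m_\infty\,z_1^{\,\alpha(n_0 - n)}}{\Gamma(\alpha + 1)}\left(\frac{B}{m_\infty - m}\right)^{2} . \qquad (38) $$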

Let $Q_n$ be the probability that two randomly chosen vertices are separated by a distance less than or equal to $n$. In fact, this is a function of $\nu_n$, or, equivalently, of $n - n_0$, see Eq. (34). That is, we have $Q_n = Q(n - n_0)$. This function, $Q(l)$, can be expressed as

$$ Q(l) = \int_0^1 dm\,m\,\mathcal{P}(l, m) = \int_0^\infty dx\,p(x)\left[1 - g\!\left(z_1^l x\right)\right] . \qquad (39) $$

Here we used Eq. (33) and introduced $x = z_1^{-l}\,g^{-1}(1 - m)$ as the integration variable. At $l < 0$, $|l| \gg 1$, we have $g(z_1^l x) = 1 - z_0 z_1^l x$ in the actual region of integration. Then, taking into account Eq. (31), we obtain

$$ Q(l) = z_0^2\,z_1^{\,l} . \qquad (40) $$

As $l = n - n_0 \to +\infty$, the factor in the square brackets in the integral in Eq. (39) becomes equal to $m_\infty$ everywhere except at $x = 0$, where this factor is zero. Then we have:

$$ \lim_{l \to +\infty} Q(l) = m_\infty\int_{+0}^{\infty} dx\,p(x) = m_\infty^2 . \qquad (41) $$

Here the (^-functional part of p (x) is excluded from the integral. This result is obvious — the distance between two 
vertices is less than infinity if both the vertices belong to the giant connected component. One can show (see Appendix 
A) that for large positive I, 



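$$Q(l) \cong m_\infty^2 - \frac{B^2 \ln(1/z_c)}{\Gamma(\alpha + 1)}\,(l - l_0)\, z_c^{\,l}, \qquad (42)$$

where

$$l_0 = \frac{\Gamma(\alpha + 1)}{B \ln(1/z_c)} \int_0^\infty dx\, x^{-\alpha}\left[2\ln x - \psi(\alpha)\right]\left[x\tilde p'(x) - (\alpha - 1)\tilde p(x)\right] = \frac{\alpha}{B \ln(1/z_c)} \int_0^\infty dy\, y^{\alpha - 1}\left[2\ln y - \psi(\alpha)\right]\left[y\tilde g'(y) + \alpha\tilde g(y)\right], \qquad (43)$$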



where $\psi(\alpha) = \Gamma'(\alpha)/\Gamma(\alpha)$.

It is convenient to characterize the distribution of distances in the graph using the size-independent (in the thermodynamic limit) probability density of $l = n - n_0$, $R(l)$:

$$R(l) = \frac{1}{m_\infty^2}\,\frac{dQ(l)}{dl}.$$



The average value and the dispersion of I are equal to 

1 



/ = - 



lnz! 



dxp (x) In x + 7 C 



1 






t 2 / 


In Z\ 


. Jo 



dyg' (y) lny-7e 



(44) 



(45) 
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(l-l)' 



In 2 zi 



In 2 z\ 



dx p (x) In 2 x — 2 



dy 5' (y) in 2 y + 2 



<ixp (x) lnx 



2 . 2 1 

+ t = 



1 2 



(y) lny 



7i~ 

IT 



(46) 



(see Appendix A). $\gamma_e = 0.577216\ldots$ is the Euler-Mascheroni constant.

Note that the asymptotic formulae given above are valid only if $\phi'(t_c) \neq 0$ and $\phi_1'(t_c) \propto \phi''(t_c) \neq 0$. These conditions are violated if $\phi'(0) = 0$, which is the case when $t_c = 0$ and $m_\infty = 1$. If $\phi'(0) = 0$, but $\phi_1'(0) \neq 0$ (vertices of degree one are absent, but vertices of degree two are present), the asymptotics of $f(y)$ at large $y$ is again $f(y) \cong A y^{-\alpha}$, $\alpha = \ln(1/z_c)/\ln z_1$, $z_c = \phi_1'(0) = 2\Pi(2)/\bar q$. In this particular case, because of $g(y) = \phi[f(y)] \sim [f(y)]^2 \sim y^{-2\alpha}$, in all formulae $\alpha$ must be replaced with $2\alpha$, and, respectively, $z_c$ with $z_c^2$.

The situation is different if $\phi_1'(0) = \phi''(0) = 0$. Assume that the minimal vertex degree in the network is $k \geq 3$. Then at $x \to 0$, $\phi(x) \cong \Pi(k)\, x^k$ and $\phi_1(x) \cong k\Pi(k)\, x^{k-1}/\bar q$. So, instead of Eq. (29), we have at large $y$:

$$f(y) = C_k\left[f(y/z_1)\right]^{k-1}, \qquad C_k = \frac{k\Pi(k)}{\bar q} \leq 1, \qquad (47)$$

where $C_k = 1$ only if $\Pi(q) = \delta(q - k)$ (in this case the equation for $f$ can be solved exactly). Its solution is

$$f(y) = C_k^{-1/(k-2)} \exp\left(-A y^{b}\right), \qquad b = \frac{\ln(k - 1)}{\ln z_1} < 1, \qquad (48)$$

with some constant positive coefficient $A$. Let us prove that $b < 1$, i.e. $z_1 > k - 1$. Indeed,

$$\bar q z_1 - \bar q(k - 1) = \sum_{q=k}^{\infty} q(q - k)\,\Pi(q) \geq 0. \qquad (49)$$

The equality is possible only in the case $\Pi(q) = \delta(q - k)$. The asymptotics of $g(y)$ at large $y$ is

$$g(y) \cong \Pi(k)\left[f(y)\right]^k = \left(\frac{\bar q}{k}\right)^{k/(k-2)}\left[\Pi(k)\right]^{-2/(k-2)} \exp\left(-B y^b\right), \qquad (50)$$



where B = kA. The asymptotics of the function p(x) at x — > can be obtained by making the saddle point evaluation 
of the inverse Laplace transform integral in Eq. (jig). We have: 



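$$p(x) \cong D\, x^{-(2-b)/[2(1-b)]} \exp\left(-C x^{-b/(1-b)}\right), \qquad C = \frac{1-b}{b}\,(bB)^{1/(1-b)}, \qquad D = \left(\frac{\bar q}{k}\right)^{k/(k-2)}\left[\Pi(k)\right]^{-2/(k-2)}\frac{(bB)^{1/[2(1-b)]}}{\sqrt{2\pi(1-b)}}. \qquad (51)$$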
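The inequality (49), and with it the bound $b < 1$, is easy to confirm numerically for a concrete degree distribution. The following is a minimal sketch under stated assumptions: the function name `check_inequality_49` and the example distribution `pi_example` are hypothetical illustrations and do not come from the text; the minimal degree must be at least 3 for the logarithm in $b$ to be meaningful.

```python
import math

def check_inequality_49(pi):
    """Evaluate both sides of inequality (49) for a degree distribution.

    pi maps a degree q to its probability Pi(q); the minimal degree must be >= 3.
    """
    k = min(pi)                                        # minimal vertex degree
    q_mean = sum(q * p for q, p in pi.items())         # first moment
    q2_mean = sum(q * q * p for q, p in pi.items())    # second moment
    z1 = q2_mean / q_mean - 1.0                        # z_1 = <q^2>/<q> - 1
    lhs = q_mean * z1 - q_mean * (k - 1)               # left-hand side of (49)
    rhs = sum(q * (q - k) * p for q, p in pi.items())  # right-hand side of (49)
    b = math.log(k - 1) / math.log(z1)                 # exponent b of Eq. (48)
    return k, z1, lhs, rhs, b

# Hypothetical example distribution (not from the text): degrees 3, 4 and 6.
pi_example = {3: 0.5, 4: 0.3, 6: 0.2}
k, z1, lhs, rhs, b = check_inequality_49(pi_example)
print(f"k={k}, z1={z1:.4f}, lhs={lhs:.4f}, rhs={rhs:.4f}, b={b:.4f}")
```

For a regular graph, where every vertex has the same degree $k$, both sides vanish and $b = 1$, which is the equality case mentioned above.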



The asymptotic expressions (36)-(38) must be replaced with new ones in this case. For brevity, we present here explicitly only the asymptotic expression for the cumulative distance distribution $Q(l)$ when the distance deviation $l$ is positive and large, i.e. the one which replaces Eq. (42). We have:

Q(l) = 1- / dxp{x)g{x/v) s 
Jo 

dxx^ 2 -W (1 ^ exp (-Cx-^-V - Bv- b x b ) , 

assuming that the actual region of integration is v -C x -C 1. Here F — (q/k) k ^ k 2 -* [LI (A;)] ~ 2 /^ fc 2 - ) 



1 - F 



Then, the saddle point calculation of this integral gives: 

Q (0 = 1 - Gzi /2 exp 



-ifz 



Z/(2-b) 



(52) 

D and v = z\. 

(53) 



where $G$ and $H$ are some positive numbers which can be expressed in terms of $b$, $B$, $k$ and $\Pi(k)$.






VII. NETWORKS WITH AN EXPONENTIALLY DECAYING DEGREE DISTRIBUTION 



Here we present a two-parameter family of degree distributions, for which one can obtain exact analytical expressions for $\mathcal{P}_n(t)$, and, consequently, for the intervertex distance distribution. These are the degree distributions for which $\phi_1(x)$ is a fractional linear function. These functions form a group with respect to the operation of functional composition. Indeed, the composition of any fractional linear functions is a fractional linear function, and the inverse of any fractional linear function is also a fractional linear function. Then, one can look for the solution of Eq. (12), $f(y)$, in the form of a fractional linear function too. It is more convenient to define a one-parameter family of linear fractionals $f(y)$, and then to write $\phi_1$ as $\phi_1(x) = f[z_1 f^{-1}(x)]$.

Any linear fractional $f(y)$, under the conditions $f(0) = 1$ and $f'(0) = -1$, may be written as

$$f(y) = t_c + \frac{(1 - t_c)^2}{1 - t_c + y}. \qquad (54)$$

It depends on one parameter $t_c = f(\infty)$, whose meaningful values belong to the interval $(0, 1)$. Then $\phi_1(x)$ may be expressed as

$$\phi_1(x) = \frac{(z_1 - 1)\,t_c + (1 - z_1 t_c)\,x}{z_1 - t_c - (z_1 - 1)\,x}. \qquad (55)$$

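The degree distribution $\phi(x) \propto \int dx\, \phi_1(x)$, $\phi(1) = 1$, can be restored up to an additive constant. Since $\phi(0)$ is the fraction of zero-degree vertices, whose effect on the properties of the network is trivial, it is natural to set the integration constant so that such vertices will be excluded, $\phi(0) = 0$. The result is

$$\phi(x) = \beta x + (1 - \beta)\,\frac{\ln(1 - \zeta x) + \zeta x}{\ln(1 - \zeta) + \zeta}, \qquad (56)$$

where the parameters $\beta$ and $\zeta$ are connected with $z_1$ and $t_c$ as follows:

$$z_1 = \frac{(1 - \beta)\,\zeta^2}{1 - \zeta}\,\frac{1}{\zeta^2 - \beta\left[(1 - \zeta)\ln(1 - \zeta) + \zeta\right]}, \qquad t_c = \frac{\beta(1 - \zeta)}{\zeta}\,\frac{\ln(1 - \zeta) + \zeta}{\beta\left[(1 - \zeta)\ln(1 - \zeta) + \zeta\right] - \zeta^2}. \qquad (57)$$

The degree distribution in the original representation is

$$\Pi(q) = \beta\,\delta(q - 1) - \left[1 - \delta(q - 1)\right]\frac{1 - \beta}{\ln(1 - \zeta) + \zeta}\,\frac{\zeta^q}{q}, \qquad q \geq 1. \qquad (58)$$

In this case, the average vertex degree is

$$z_0 = \phi'(1) = \frac{\beta\left[(1 - \zeta)\ln(1 - \zeta) + \zeta\right] - \zeta^2}{(1 - \zeta)\left[\ln(1 - \zeta) + \zeta\right]} \qquad (59)$$

and the relative size of the giant connected component is

$$m_\infty = 1 - \phi(t_c) = \frac{(1 - \beta t_c)\left[\ln(1 - \zeta) + \zeta\right] - (1 - \beta)\left[\ln(1 - \zeta t_c) + \zeta t_c\right]}{\ln(1 - \zeta) + \zeta}. \qquad (60)$$

FIG. 2. Phase diagram of the model with an exponentially decaying degree distribution. Here GCC indicates the presence of a giant connected component in the network, and CDN a completely disconnected network. The capacity (relative size) of the giant connected component is $m_\infty = 1$ along the line $\beta = 0$.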

The giant connected component exists if $z_1 > 1$, which in our case corresponds to $\beta < \beta_c(\zeta)$, where

$$\beta_c(\zeta) = \frac{\zeta^3}{\zeta(2\zeta - 1) - (1 - \zeta)^2\ln(1 - \zeta)}. \qquad (61)$$

The phase diagram of the model is shown in Fig. 2. It should be noted that the giant connected component disappears if the number of one-degree vertices (dead ends) exceeds some critical value. If these vertices are absent, (almost) the entire network is a single connected component.
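The critical line $\beta_c(\zeta)$ can be cross-checked against the condition $z_1 = 1$. The sketch below assumes the Z-transformed degree distribution $\phi(x) = \beta x + (1-\beta)[\ln(1-\zeta x) + \zeta x]/[\ln(1-\zeta) + \zeta]$ of this section and computes $z_1 = \phi''(1)/\phi'(1)$ from its derivatives; the function names `z1_family` and `beta_c` are illustrative choices, not notation from the paper.

```python
import math

def z1_family(beta, zeta):
    """z_1 = phi''(1) / phi'(1) for the two-parameter family of this section."""
    d = math.log(1.0 - zeta) + zeta                                # ln(1-zeta)+zeta < 0
    dphi = beta - (1.0 - beta) * zeta**2 / ((1.0 - zeta) * d)      # phi'(1)
    d2phi = -(1.0 - beta) * zeta**2 / ((1.0 - zeta) ** 2 * d)      # phi''(1)
    return d2phi / dphi

def beta_c(zeta):
    """Critical fraction of one-degree vertices, in the form of Eq. (61)."""
    return zeta**3 / (zeta * (2.0 * zeta - 1.0)
                      - (1.0 - zeta) ** 2 * math.log(1.0 - zeta))

for zeta in (0.2, 0.4, 0.6, 0.8):
    bc = beta_c(zeta)
    # On the critical line z_1 should equal 1; below it z_1 > 1.
    print(f"zeta={zeta}: beta_c={bc:.4f}, z1(beta_c)={z1_family(bc, zeta):.6f}, "
          f"z1(beta_c/2)={z1_family(0.5 * bc, zeta):.4f}")
```

With these formulas every $(\zeta, \beta)$ pair listed in the caption of Fig. 3 lies below the critical line, i.e. inside the phase with a giant connected component.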

The composition function g (x) = <p[f (x)], which is the Laplace transform of the distribution function p(s) of 
s n = Zq (zi — 1) z^ n M n , M n being the size of the n-th connected component for a large but finite n, is 



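$$g(y) = 1 - m_\infty + \frac{\beta(1 - t_c)^2}{1 - t_c + y} + \frac{1 - \beta}{\ln(1 - \zeta) + \zeta}\left[\ln\frac{(1 - t_c)/z_1 + y}{1 - t_c + y} + \frac{\zeta(1 - t_c)^2}{1 - t_c + y}\right]. \qquad (62)$$

The calculation of the inverse Laplace transform is straightforward:

$$p(x) = (1 - m_\infty)\,\delta(x) + \beta(1 - t_c)^2 \exp\left[-(1 - t_c)x\right] - \frac{1 - \beta}{\ln(1 - \zeta) + \zeta}\left\{\frac{1}{x}\Big[\exp\left[-(1 - t_c)x/z_1\right] - \exp\left[-(1 - t_c)x\right]\Big] - \zeta(1 - t_c)^2\exp\left[-(1 - t_c)x\right]\right\}. \qquad (63)$$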
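The Laplace-transform pair $g(y)$, $p(x)$ can be verified numerically in the simplest case $\beta = 0$, $t_c = 0$, $m_\infty = 1$, where $f(y) = 1/(1+y)$ and $z_1 = 1/(1-\zeta)$. The sketch below is an illustration under these assumptions only: the value of `ZETA`, the midpoint-rule integrator `laplace`, and its cut-off `x_max` are arbitrary choices made for the example.

```python
import math

ZETA = 0.5                                   # illustrative value of zeta
Z1 = 1.0 / (1.0 - ZETA)                      # z_1 at beta = 0, t_c = 0
D = math.log(1.0 - ZETA) + ZETA              # ln(1 - zeta) + zeta, negative

def g(y):
    """g(y) = phi[f(y)] with f(y) = 1/(1+y) and phi at beta = 0."""
    f = 1.0 / (1.0 + y)
    return (math.log(1.0 - ZETA * f) + ZETA * f) / D

def p(x):
    """p(x) at beta = 0, t_c = 0 (no delta-function part since m_inf = 1)."""
    return -((math.exp(-x / Z1) - math.exp(-x)) / x - ZETA * math.exp(-x)) / D

def laplace(func, y, x_max=80.0, n=80000):
    """Midpoint-rule estimate of the Laplace transform of func at the point y."""
    h = x_max / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += func(x) * math.exp(-y * x)
    return total * h

for y in (0.0, 0.5, 1.0, 3.0):
    print(f"y={y}: numerical={laplace(p, y):.8f}, g(y)={g(y):.8f}")
```

The value at $y = 0$ checks the normalization of $p(x)$, since $g(0) = 1$.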



Below, for the sake of simplicity, we present the results for $\beta = 0$ only, when $\Pi(1) = 0$ and the "dead ends" are absent. (See results for $\beta \neq 0$ in Fig. 3.) In this case $t_c = 0$ and $m_\infty = 1$, i.e. the giant connected component (almost) coincides with the whole graph.

FIG. 3. Cumulative distance distribution function $Q(l)$ in the model with an exponentially decaying degree distribution for various values of $\zeta$ and $\beta$. When $\zeta = 0.2$, curves 1, 2, 3, and 4 correspond to $\beta = 0.3$, 0.25, 0.15, and 0.05, respectively. If $\zeta = 0.4$, curves 1, 2, 3, and 4 correspond to $\beta = 0.5$, 0.4, 0.2, 0.1, respectively. If $\zeta = 0.6$, curves 1, 2, 3, and 4 correspond to $\beta = 0.7$, 0.5, 0.3, 0.1, respectively. If $\zeta = 0.8$, curves 1, 2, 3, and 4 correspond to $\beta = 0.8$, 0.6, 0.4, 0.2, respectively.



The distribution function $\mathcal{P}_n(m) = \mathcal{P}(n - n_0, m)$ depends upon the size of the network through $n_0$:

ln(l-C) + C 



no = 1 + In 



c 3 



(64) 



Its dependence on $m$ can be represented in a parametric form by introducing a parameter $t$, related to the size of the connected component $m$ as

$$m(t) = 1 - \phi(t) = \frac{\ln\left[(1 - \zeta t)/(1 - \zeta)\right] - \zeta(1 - t)}{\left|\ln(1 - \zeta) + \zeta\right|}. \qquad (65)$$



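Using this parametrization, Eq. (33) can be written as:

$$\mathcal{P}(l, t) = v\,\frac{\left[f^{-1}(t)\right]'}{m'(t)}\; p\left[v f^{-1}(t)\right] = \frac{1 - \zeta t}{\zeta^2 t^2 (1 - t)}\left\{\exp\left[-(1 - \zeta)\,v\,\frac{1 - t}{t}\right] - \left(1 + \zeta v\,\frac{1 - t}{t}\right)\exp\left[-v\,\frac{1 - t}{t}\right]\right\}, \qquad v = (1 - \zeta)^l. \qquad (66)$$

Eqs. (65) and (66) determine $\mathcal{P}(l, m)$ in a parametric form. At small $m$, $\mathcal{P}(l, m)$ has the asymptotics:

$$\mathcal{P}(l, m) \cong \frac{(1 - \zeta)\,\bar q}{\zeta^2 m}\left[\exp\left(-(1 - \zeta)\,v\,\frac{m}{\bar q}\right) - \left(1 + \zeta v\,\frac{m}{\bar q}\right)\exp\left(-v\,\frac{m}{\bar q}\right)\right], \qquad (67)$$

where $\bar q = \phi'(1) = \zeta^2/\left[(1 - \zeta)\left|\ln(1 - \zeta) + \zeta\right|\right]$ is the average vertex degree. Equation (67) is valid if $m \ll 1$. On the other hand, as $1 - m \ll 1, v^2$, we have:

$$\mathcal{P}(l, m) \cong \frac{(1 - \zeta)\,\bar q}{2\zeta^2(1 - m)}\exp\left[-(1 - \zeta)^{3/2}\, v\,\sqrt{\frac{\bar q}{2(1 - m)}}\right]. \qquad (68)$$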



The cumulative distance distribution $Q_n$ is a function of $l = n - n_0(N)$:

$$Q(l) = \int_0^1 dt\, |m'(t)|\, m(t)\, \mathcal{P}(l, t). \qquad (69)$$

We could not evaluate this integral analytically, but asymptotic expressions can be presented. As $l > 0$ and $l \gg 1$,

$$Q(l) = 1 - A\,(l - l_0)\,(1 - \zeta)^{2l}, \qquad (70)$$



.4 



2[C + ln(l-C)] 



/n 



|ln(l-C)| 



2(1-C) 



C + ln(l-C) 
C 2 In (1-C) 



(71) 



Here we have $(1 - \zeta)^{2l}$ instead of $(1 - \zeta)^{l}$ in the asymptotics, because one-degree vertices are absent in the network, but vertices of degree two are present. On the other hand, as $l < 0$ and $|l| \gg 1$, we have

$$Q(l) = \left[\frac{\zeta^2}{\ln(1 - \zeta) + \zeta}\right]^2 (1 - \zeta)^{-l - 2}. \qquad (72)$$
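The two limits of the hull function can be illustrated numerically for $\beta = 0$. The sketch below assumes that the cumulative distribution can be written as $Q(l) = \int_0^\infty dx\, p(x)\,[1 - g(z_1^l x)]$, which is our reading of the general expression for $Q(l)$ and should be treated as an assumption; `ZETA`, the integrator settings, and the function names are illustrative. It compares the numerical value with the large negative $l$ form $z_0^2 z_1^l = [\zeta^2/(\ln(1-\zeta)+\zeta)]^2 (1-\zeta)^{-l-2}$.

```python
import math

ZETA = 0.5                                      # illustrative value of zeta
Z1 = 1.0 / (1.0 - ZETA)                         # z_1 = 1/(1 - zeta) at beta = 0
D = math.log(1.0 - ZETA) + ZETA                 # ln(1 - zeta) + zeta, negative
Z0 = -ZETA**2 / ((1.0 - ZETA) * D)              # mean degree at beta = 0

def g(y):
    """g(y) = phi[f(y)] with f(y) = 1/(1+y) (t_c = 0)."""
    f = 1.0 / (1.0 + y)
    return (math.log(1.0 - ZETA * f) + ZETA * f) / D

def p(x):
    """Inverse Laplace transform of g at beta = 0, t_c = 0."""
    return -((math.exp(-x / Z1) - math.exp(-x)) / x - ZETA * math.exp(-x)) / D

def Q(l, x_max=80.0, n=40000):
    """Midpoint-rule estimate of the assumed form int p(x) [1 - g(z1^l x)] dx."""
    h = x_max / n
    scale = Z1 ** l
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += p(x) * (1.0 - g(scale * x))
    return total * h

for l in (-10, -5, 0, 5, 10):
    print(f"l={l}: Q={Q(l):.6f}, z0^2 z1^l={Z0**2 * Z1**l:.6f}")
```

For large negative $l$ the two columns agree to within about a percent, while for large positive $l$ the numerical $Q(l)$ saturates at 1, as expected for $m_\infty = 1$.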



The position of the center of the distance distribution and its mean square deviation are given by Eqs. (45) and (46), respectively. Calculating the integrals yields



In (1-C) 



|ln(l-OI ln(l-0 + C 



(73) 



(l-l)' 



it 2 2 In (1-0 1 

21n 2 (l-0 + 31n(l-0 + C ~ 2 



In (1-Q 
ln(l-0+C 



(74) 



VIII. POWER-LAW DEGREE DISTRIBUTION WITH AN EXPONENTIAL CUT-OFF 



The general scheme, introduced in this paper, is applicable only if the degree distribution has finite first and second moments in the thermodynamic limit. If, for example, the degree distribution $\Pi(q)$ is asymptotically a "scale-free" one, $\Pi(q) \sim q^{-\gamma}$, at large $q$, and exponent $\gamma < 3$, then our considerations fail. In this section we shall consider the networks with power-law degree distributions, $2 < \gamma < 3$, and with an exponential cut-off at large degrees.

The crucial point of our formalism is to find the general solution of the recursive relation $t_{n+1} = \phi_1(t_n)$, or, more precisely, to find how this solution will behave at a large number of iterations $n$. One can see that this recursion relation is easily solvable in the following case:

$$\phi_1(x) = 1 - (1 - t_c)\left(\frac{1 - x}{1 - t_c}\right)^{\gamma - 2}, \qquad (75)$$

where $2 < \gamma < 3$ and $0 < t_c < 1$. Indeed, we have:

$$t_n = 1 - (1 - t_c)\left(\frac{1 - t_0}{1 - t_c}\right)^{(\gamma - 2)^n}, \qquad (76)$$

that is the analytic form of the solution with any initial condition. However, we have to have a finite value of $z_1 = \phi_1'(1)$ to apply the general scheme. Then, let us define

$$\phi_1(x) = \frac{1 - (1 - \zeta x)^{\gamma - 2}}{1 - (1 - \zeta)^{\gamma - 2}}, \qquad (77)$$



where we set the parameter $t_c$ regulating the size of the giant connected component to be equal to zero, which means that the giant connected component contains almost all the vertices. The problem can be solved for an arbitrary $t_c$, but then the results would look essentially more cumbersome. The parameter $\zeta < 1$ corresponds to the cut-off. Indeed, the Z-transformed degree distribution $\phi(x)$ may be easily obtained from Eq. (77), by integrating its right-hand side. We have:

$$\phi(x) = \frac{(1 - \zeta x)^{\gamma - 1} - 1 + \zeta(\gamma - 1)\,x}{(1 - \zeta)^{\gamma - 1} - 1 + \zeta(\gamma - 1)}. \qquad (78)$$

Then the degree distribution is

$$\Pi(q) = \frac{1}{(1 - \zeta)^{\gamma - 1} - 1 + \zeta(\gamma - 1)}\,\frac{\Gamma(\gamma)\sin(\pi\gamma)}{\pi}\,\frac{\Gamma(q - \gamma + 1)}{\Gamma(q + 1)}\,\zeta^q, \qquad q \geq 2, \qquad (79)$$

from which one can easily see that $\Pi(q) \sim q^{-\gamma}\zeta^q$ at $q \to \infty$. In the following we shall assume $1 - \zeta \ll 1$.

We have to find the solution $f(y)$ of the functional equation $f(y) = \phi_1[f(y/z_1)]$, or:

$$\left[1 - (1 - \zeta)^{\gamma - 2}\right] f(y) = 1 - \left[1 - \zeta f(y/z_1)\right]^{\gamma - 2}, \qquad (80)$$

where

$$z_1 = \phi_1'(1) \cong (\gamma - 2)(1 - \zeta)^{\gamma - 3} \gg 1. \qquad (81)$$
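Two statements of this section lend themselves to a quick numerical check: the closed-form iterate (76) of the map (75), and the leading-order estimate (81) of $z_1$ for the cut-off form (77). The sketch below is illustrative; the function names and the parameter values ($\gamma = 2.5$, $t_c = 0.3$, $t_0 = 0.9$, $1 - \zeta = 10^{-6}$) are assumptions chosen for the example.

```python
def phi1_power_law(x, gamma, t_c):
    """Eq. (75): phi_1(x) = 1 - (1 - t_c) * ((1 - x) / (1 - t_c)) ** (gamma - 2)."""
    return 1.0 - (1.0 - t_c) * ((1.0 - x) / (1.0 - t_c)) ** (gamma - 2.0)

def t_closed_form(n, t0, gamma, t_c):
    """Eq. (76): closed form of the n-th iterate started from t0."""
    return 1.0 - (1.0 - t_c) * ((1.0 - t0) / (1.0 - t_c)) ** ((gamma - 2.0) ** n)

def t_iterated(n, t0, gamma, t_c):
    """Apply the map t -> phi_1(t) n times."""
    t = t0
    for _ in range(n):
        t = phi1_power_law(t, gamma, t_c)
    return t

def z1_exact(gamma, zeta):
    """phi_1'(1) computed directly from the cut-off form (77)."""
    return ((gamma - 2.0) * zeta * (1.0 - zeta) ** (gamma - 3.0)
            / (1.0 - (1.0 - zeta) ** (gamma - 2.0)))

def z1_leading(gamma, zeta):
    """Leading-order estimate of Eq. (81), valid for 1 - zeta << 1."""
    return (gamma - 2.0) * (1.0 - zeta) ** (gamma - 3.0)

gamma, t_c, t0 = 2.5, 0.3, 0.9
for n in (1, 3, 10):
    print(n, t_iterated(n, t0, gamma, t_c), t_closed_form(n, t0, gamma, t_c))

zeta = 1.0 - 1e-6
print(z1_exact(gamma, zeta), z1_leading(gamma, zeta))
```

The iterates approach the fixed point $t_c$ doubly exponentially fast in $n$, since the exponent $(\gamma - 2)^n$ in (76) tends to zero.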



The function $f(y)$ must satisfy the initial conditions $f(0) = 1$ and $f'(0) = -1$. Therefore, for small enough $|y|$ we have: $f(y) \approx 1 - y$. This approximate equality holds when $|y| \ll |f'(0)/f''(0)| = 1/|f''(0)| \sim (1 - \zeta)^{\gamma - 2}$ (see Appendix B). On the other hand, at large enough $|y|$ one can set in Eq. (80) $\zeta = 1$ (but not $z_1 = \infty$!). The resulting equation

$$1 - f(y) = \left[1 - f(y/z_1)\right]^{\gamma - 2} \qquad (82)$$

can be easily solved:

$$f(y) = 1 - \exp\left(-A y^{-\delta}\right), \qquad \frac{1}{\delta} = \frac{\ln z_1}{\ln\left[1/(\gamma - 2)\right]} = \frac{(3 - \gamma)\ln(1 - \zeta)}{\ln(\gamma - 2)} - 1 \gg 1. \qquad (83)$$
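That the form (83) solves the reduced equation (82) for any value of the constant $A$ follows from $(\gamma - 2)\, z_1^{\delta} = 1$, and it can be confirmed numerically. The sketch below is a minimal check; the helper names and the parameter values are illustrative assumptions, and $A$ is deliberately left arbitrary because it is fixed only by the matching procedure described next in the text.

```python
import math

def make_tail_solution(gamma, z1, a_const):
    """Return delta of Eq. (83) and the function 1 - f(y) = exp(-A * y**(-delta))."""
    delta = math.log(1.0 / (gamma - 2.0)) / math.log(z1)
    def one_minus_f(y):
        return math.exp(-a_const * y ** (-delta))
    return delta, one_minus_f

def residual_82(gamma, z1, a_const, y):
    """Difference between the two sides of Eq. (82) at the point y."""
    _, one_minus_f = make_tail_solution(gamma, z1, a_const)
    return one_minus_f(y) - one_minus_f(y / z1) ** (gamma - 2.0)

# Illustrative values only; A is left free here.
for y in (0.1, 1.0, 10.0, 1000.0):
    print(y, residual_82(2.5, 50.0, 0.7, y))
```

The residual stays at the level of floating-point rounding for every $y$, independently of the chosen $A$.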



The constant $A$ must be determined by sewing together the expressions for small and large $y$ (see Appendix B), which gives: $A = 1/(e\delta)$. Thus, we have:

$$f(y) = 1 - \exp\left[-\frac{1}{e\delta\, y^{\delta}}\right]. \qquad (84)$$

This formula is valid if $|y| \gg (1 - \zeta)^{\gamma - 2}$ (Appendix B), i.e. for a large enough cut-off parameter $|\ln\zeta|$, this formula is valid almost everywhere except a small vicinity of $y = 0$. Then, $g(y) = \phi[f(y)]$ is:

$$g(y) = 1 - \frac{1}{\gamma - 2}\left\{(\gamma - 1)\exp\left[-\frac{1}{e\delta\, y^{\delta}}\right] - \exp\left[-\frac{\gamma - 1}{e\delta\, y^{\delta}}\right]\right\}. \qquad (85)$$

The inverse Laplace transform of $g(y)$ is

$$p(x) = \frac{\gamma - 1}{\gamma - 2}\,\frac{x^{\delta - 1}}{e}\left\{\exp\left[-\frac{x^{\delta}}{e\delta}\right] - \exp\left[-\frac{(\gamma - 1)\,x^{\delta}}{e\delta}\right]\right\} \qquad (86)$$

(see Appendix B). The region of validity of this formula is $|x| \ll (1 - \zeta)^{2 - \gamma}$.

FIG. 4. Series of the connected-component size distributions $\mathcal{P}(l, m)$ in the model with a power-law degree distribution for various values of the $\gamma$ exponent. As the distribution more and more concentrates near $m = 1$, the parameter $l$ of the curves takes the values 0.5, 0, -1.0, -1.5 when $\gamma = 2.2$; the values 1.0, 0.5, 0, -1.0, -1.5 when $\gamma = 2.4$; 2.0, 1.0, 0, -4.0 when $\gamma = 2.6$; and 4.0, 1.0, -2.0, -5.0, -10.0 when $\gamma = 2.8$.



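Now, let us consider the distribution of the size of the $n$-th connected component, $\mathcal{P}_n(m)$. Since it is impossible to calculate the inverse of $g(y)$ analytically, we use the parametric form of Eq. (33), introducing a parameter $t$, which is connected with $m$ as

$$m(t) = 1 - \phi(t) = \frac{1}{\gamma - 2}\left[(\gamma - 1)(1 - t) - (1 - t)^{\gamma - 1}\right]. \qquad (87)$$

Then we have:

$$\mathcal{P}_n(t) = \nu_n\,\frac{\left[f^{-1}(t)\right]'}{m'(t)}\; p\left[\nu_n f^{-1}(t)\right], \qquad (88)$$

where $f^{-1}(t)$ is the inverse of the function $f(y)$, Eq. (84), and $\nu_n = z_0(z_1 - 1)\,z_1^{-n} N$. Combining Eqs. (87) and (88) we obtain:

$$\mathcal{P}_n(t) = \frac{(\gamma - 2)^{n - n_0}}{\left[1 - (1 - t)^{\gamma - 2}\right](1 - t)\ln^2(1 - t)}\left\{\exp\left[\frac{(\gamma - 2)^{n - n_0}}{\ln(1 - t)}\right] - \exp\left[\frac{(\gamma - 1)(\gamma - 2)^{n - n_0}}{\ln(1 - t)}\right]\right\}, \qquad (89)$$

where

$$(\gamma - 2)^{n - n_0} = \frac{(\gamma - 2)^n N^{\delta}}{(e\delta)^2}, \qquad (90)$$

$$n_0 = \frac{\delta\ln N - 2(\ln\delta + 1)}{\ln\left[1/(\gamma - 2)\right]}, \qquad (91)$$

and $\delta$ is given by Eq. (83). Eqs. (87) and (89) define the distributions of the sizes of the $n$-th connected component $\mathcal{P}_n(m)$ in the parametric form, see Fig. 4. The order of a connected component, $n$, and the size of the system, $N$, enter here only in the combination $n - n_0(N)$, i.e. one can write $\mathcal{P}_n(m) = \mathcal{P}(n - n_0, m)$. One can write the following asymptotic expression for $\mathcal{P}(l, m)$: at small $m$, $m \ll \exp\left[-(\gamma - 2)^l\right]$,

$$\mathcal{P}(l, m) \cong \frac{(\gamma - 1)(\gamma - 2)^{2l}}{m\ln^3(\bar q/m)}, \qquad (92)$$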






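where $\bar q = (\gamma - 1)/(\gamma - 2)$ is the mean degree. When $m$ is close to 1, or, more precisely, when $1 - m \ll 1, (\gamma - 2)^{2l}$, we have:

$$\mathcal{P}(l, m) \cong (\gamma - 2)^{l - 1}\left[\frac{\gamma - 1}{2(1 - m)}\right]^{3/2}\exp\left\{-(\gamma - 2)^{l}\sqrt{\frac{\gamma - 1}{2(1 - m)}}\right\}. \qquad (93)$$

Obviously, the distribution is concentrated near $m = 0$ when $l < 0$, $|l| \gg 1$, and near $m = 1$ when $l > 0$, $l \gg 1$.

The intervertex distance distribution actually depends on $n - n_0 = l$ only. For the cumulative distance distribution $Q_n = \mathrm{Prob}(\mathrm{Distance} \leq n) = Q(n - n_0)$ we have:

$$Q(l) = \int_0^1 dm\, m\,\mathcal{P}(l, m) = \int_0^1 dt\, |m'(t)|\, m(t)\,\mathcal{P}(l, t). \qquad (94)$$

This integral can easily be evaluated, and we obtain

$$Q(l) = \frac{2(\gamma - 1)\,\mu^{1/2}}{(\gamma - 2)^2}\left\{(\gamma - 1)\,K_1\left(2\mu^{1/2}\right) + K_1\left[2(\gamma - 1)\,\mu^{1/2}\right] - 2(\gamma - 1)^{1/2}K_1\left[2(\gamma - 1)^{1/2}\mu^{1/2}\right]\right\}, \qquad (95)$$

where $\mu = (\gamma - 2)^l$ and $K_1$ is the Macdonald function, see Fig. 5. For large negative $l$, using the large argument asymptotics of $K_1(z)$ we obtain the following expression:

$$Q(l) \cong \left(\frac{\gamma - 1}{\gamma - 2}\right)^2 \pi^{1/2}\,(\gamma - 2)^{l/4}\exp\left[-2(\gamma - 2)^{l/2}\right]. \qquad (96)$$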




FIG. 5. Cumulative distance distribution function $Q(l)$ in the model with a power-law degree distribution for various values of the $\gamma$ exponent. Curves 1, 2, 3, and 4 correspond to $\gamma = 2.2$, 2.4, 2.6, and 2.8, respectively.



For a large positive i > 1 we have: 



1 



Q(0 = l-J(7-l) 2 G-M(7-2f In 

z y7 — z 



il 



The hull function <2 (/) is characterized by the position of its center: 

2 



/ 



.™ at 



and by its width 



In (7 -2) 



In 2 (7 - 2) 



7e 



7T 



2 7e - 3/2-2^ In (7-I) 
In [1/(7 -2)] 

ln( 7 -l) ' 
7-2 . 

(7 - 1) In 2 (7 - 1) 
(7"2) 2 



(97) 



(98) 



(99) 






Note that all the results of this section were obtained assuming two limiting transitions. First the size of the network tends to infinity, while the cut-off parameter is kept finite. This allows us to apply the general formalism based on Eq. (12). Only afterwards does the cut-off parameter $|\ln\zeta|$ tend to infinity. This allows us to obtain the solution of Eq. (12) in the leading order. The limiting transitions in this section are performed precisely in this order. Situations where these two limiting transitions must be performed simultaneously will be discussed in the next, concluding section.



IX. CONCLUSIONS 



The most crucial restriction in our formalism is that vertices of the network are uncorrelated, so that the network is completely defined by a given degree distribution $\Pi(q)$. This allowed us to trace the evolution of the $n$-th connected component of a vertex as $n$ grows. This is possible, however, only if $\Pi(q)$ has finite first and second moments, $\bar q$ and $\overline{q^2}$. These networks contain (almost) no closed loops of finite size. Almost all loops are of the order of the average intervertex distance in the network, $\sim \ln N$. The problem of the intervertex distance statistics for these random networks is reduced to the solution of the functional equation (12). It is possible to solve it only in some particular cases. However, all the asymptotic properties of the distance distributions may be extracted from this equation. Undefined constants in the resulting asymptotic expressions (35)-(53) can be found numerically, if necessary.

The general results may be summarized as follows: 

1. The average distance $\bar d$ between two vertices in the giant connected component of the network depends on the network size $N$ as

$$\bar d = \frac{\ln(AN)}{\ln z_1}, \qquad z_1 = \frac{\overline{q^2}}{\bar q} - 1, \qquad (100)$$

where $A$ is some number. We assume $z_1 > 1$, which ensures the existence of a giant connected component.

2. The mean square deviation of the intervertex distance is some finite number $\sigma = \sqrt{\overline{(d - \bar d)^2}}$. That is, in the large network almost all vertices in the giant connected component are nearly equidistant from each other: the distance is almost certainly $\bar d$ plus or minus a few links.

3. The (cumulative) intervertex distance distribution is actually a function of $d - \bar d$, $Q(d - \bar d)$. It is nearly 0 as its argument is large and negative, $d < \bar d$, $|d - \bar d| \gg 1$, and tends to $m_\infty^2$ (the probability that two randomly chosen vertices belong to the giant connected component) as $d > \bar d$, $d - \bar d \gg 1$. (Note that the narrowness of an intervertex distance distribution also was observed in other types of networks [27, 28].)

4. At large negative $l = d - \bar d$, the asymptotics of $Q(l)$ is $Q(l) \sim z_1^l$. This result is evident. Indeed, the average size of the $n$-th connected component of a vertex is $\bar m_n \sim z_1^n$, which holds when $m_n = M_n/N \ll 1$ and these components are tree graphs.

5. The asymptotics of Q (I) at large positive I depends on the minimal vertex degree (we assume this degree is 
nonzero in any case). If q m [ n = k = 1, we have 1 — Q (I) ~ lz c , where z c < 1 is some positive number (see the 
beginning of the Section [v|) . If k = 2, the asymptotics is 1 — Q (I) ~ Izf with the same z c . So, in these two cases 
the probability that the distance between two vertices is essentially larger than its average, decays essentially as 

the exponent of the deviation. If, however, k > 3, the situation is different: 1 — Q (I) ~ z 1 / 2 exp ^— Hz l /^ 2 ~^~\ , 

where j3 = In (fc — 1) / lnzi < 1 and H is some positive number. So, this decay is essentially more rapid than an 
exponential one. The origin of this difference is clear. In the first case, k = 1, the giant connected component 
contains some number of dead ends; also, when k — 2, long chains of vertices are present in the giant connected 
component. But, contrastingly, when k > 3, the giant connected component is compact. 

6. We obtain asymptotic expressions for the size distribution $\mathcal{P}_n(m)$ for the $n$-th connected component. From this basic distribution, other valuable information about the structure of the network can be obtained. For example, from $\mathcal{P}_n(m)$, one can obtain the length distribution for closed loops in the network.
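The leading size dependence of the mean intervertex distance, $\bar d \approx \ln(AN)/\ln z_1$ with $z_1 = \overline{q^2}/\bar q - 1$, is straightforward to evaluate for a given degree distribution. The sketch below is a hedged illustration: the constant $A$ is not specified in the text, so `a_const = 1.0` is a placeholder assumption, and the truncated Poisson distribution is a hypothetical example chosen because its $z_1$ equals its mean degree.

```python
import math

def z1_from_distribution(pi):
    """z_1 = <q^2>/<q> - 1 for a degree distribution given as {q: Pi(q)}."""
    q1 = sum(q * p for q, p in pi.items())
    q2 = sum(q * q * p for q, p in pi.items())
    return q2 / q1 - 1.0

def mean_distance_estimate(pi, n_vertices, a_const=1.0):
    """Leading-order estimate ln(A N) / ln z_1; a_const stands in for the unknown A."""
    z1 = z1_from_distribution(pi)
    if z1 <= 1.0:
        raise ValueError("no giant connected component: z1 <= 1")
    return math.log(a_const * n_vertices) / math.log(z1)

# Hypothetical example: Poisson degree distribution with mean 4, truncated at q = 60.
pi_poisson = {q: math.exp(-4.0) * 4.0**q / math.factorial(q) for q in range(0, 61)}
print(z1_from_distribution(pi_poisson))
print(mean_distance_estimate(pi_poisson, 10**6))
```

Only the logarithmic growth with $N$ is meaningful here; the additive constant depends on $A$, which must be determined separately.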



In Sections VII and VIII our general formalism was applied to networks with specific types of degree distribution function. In Section VII we considered a two-parameter family of degree distributions. We chose $\Pi(q) \sim \zeta^q/q$ for $q \geq 2$, $\zeta < 1$, $\Pi(1) = \beta < 1$. A motivation for such a choice was that the function $\phi_1(x)$ is a linear-rational one, which allowed us to solve the main equation (12) analytically.



In Section VIII we studied the problem: what are the statistics of intervertex distances in the networks with the finite first and divergent second moments of the degree distribution? We introduced a degree distribution which behaves as $\Pi(q) \sim q^{-\gamma}\zeta^q$, $2 < \gamma < 3$, at large degrees $q$. So, the degree distribution is a power-law one in the limit $\zeta \to 1$. We found the leading contributions to the size distribution for the $n$-th connected component and, consequently, to the intervertex distance distribution in this limit. The results may be summarized as follows:



1. The mean intervertex distance is given by: 



IniV ln|Cln(l-C)l , ^ 

' 2 v , ;| . (101) 



(3- 7 )|ln(l-C)| |hi(7-2)| 

Note that here $N \to \infty$, and $1 - \zeta$ is kept small but finite, so the first term on the right-hand side is the leading one.

2. The intervertex distance distribution actually depends on $l = d - \bar d$, and the form of this dependence (see Eq. (95)) appears to be independent of the cut-off position.

3. The mean square deviation of the distances (Eq. (99)) is again a finite number; it depends neither on the network size nor on the cut-off.

4. The probability to find a pair of vertices separated by a distance essentially larger than $\bar d$ decays exponentially with $l = d - \bar d$, or, more precisely, as $l\,(\gamma - 2)^{2l}$. The probability to find a pair separated by a distance essentially smaller than $\bar d$ decays faster than an exponent, namely as $(\gamma - 2)^{l/4}\exp\left[-2(\gamma - 2)^{l/2}\right]$ (here $l < 0$, $|l| \gg 1$).

Now let us discuss the problem: under what conditions these results remain true if one simultaneously tends to 
infinity both the size of the system and the position of the cut-off in the degree distribution. That is, simultaneously, 
N — » 00 and £ — ► 1. The main question studied in this paper is: how does the size distribution of the ri-th connected 
component changes with its number n? In Z-representation, at sufficiently small n, this evolution is described by Eq. 
( |V . We replace this equation with Eq. ([k|) , provided that the function / is a solution of the functional equation 
(12]). This can be done if, on the one hand, n is large enough, so that Eq. m) may be replaced with its asymptotic 
form, and, on the other hand, n is small enough — the size of the connected component is still essentially smaller than 
the size of the network. 

Let us estimate how large n must be to satisfy the first requirement. For this, let us choose some to close enough 
to 1, so that <pi (to) ~ 1 — Z\ (1 — to). This means that the second term of the Taylor series of <fii (to) near to = 1, 
(1/2)01 (!) (! ~ ^o) 2 , is smaller than the first one. Since zi = 4>[ (1) - (1 - Q 7 ~ 3 and <f>'{ (1) ~ (1 - C) 7 ~ 4 , this is 
satisfied if at least 1 — to < 1 — C- The condition for the replacement of the evolution equation (||) with its asymptotical 
form means that the functions F n (x) and F n+ i (x) can be reduced to each other by the rescaling of the independent 
variable, F n+ \ (x) — F n (z\x). This is true, if after n iterations of the interval of linearity of the function </>i (x), 
(to, 1), to ~ 1 — C, the resulting interval (t n , 1) nearly coincides with its limit at n — * 00, (0, 1). In other words, we 
must require $t_n \ll 1$. Outside the interval of linearity we can write:

$$1 - t_{n+1} = 1 - \phi_1(t_n) \cong (1 - t_n)^{\gamma - 2}.$$

Consequently $t_n = 1 - (1 - t_0)^{(\gamma - 2)^n} \cong (\gamma - 2)^n\left|\ln(1 - t_0)\right| \sim (\gamma - 2)^n\left|\ln(1 - \zeta)\right| \ll 1$. Hence

$$n \gg \ln\left|\ln(1 - \zeta)\right|. \qquad (102)$$

On the other hand, the average size of the $n$-th connected component, $\bar M_n \sim z_1^n \sim (1 - \zeta)^{-(3 - \gamma)n}$, must be essentially smaller than the network size $N$. This imposes the limitation:

$$n \ll \frac{\ln N}{\left|\ln(1 - \zeta)\right|}. \qquad (103)$$

Combining inequalities (102) and (103), we get $\ln N \gg \left|\ln(1 - \zeta)\right|\,\ln\left|\ln(1 - \zeta)\right|$, or the restriction to the range of values of the cut-off parameter, where our approach is valid:

$$\left|\ln(1 - \zeta)\right| \ll \frac{\ln N}{\ln\ln N}. \qquad (104)$$

Let us introduce an $N$-dependent cut-off, which grows with $N$, being on the boundary of applicability of our approach, i.e. $\left|\ln[1 - \zeta(N)]\right| \sim \ln N/\ln\ln N$. In this event, instead of Eq. (101), we obtain

$$\bar d \sim \ln\ln N. \qquad (105)$$

(Recall that here $2 < \gamma < 3$. Also, recall that the position (degree) of the cut-off, $q_0$, and the parameter $\zeta$ are related in the following way: $\zeta = e^{-1/q_0}$. So, the boundary of applicability of the approach is $\ln q_0 \sim \ln N/\ln\ln N$.)

For finite networks, a fat-tailed degree distribution function necessarily has a cut-off, whose dependence on the network size is determined by the details of the construction procedure. If this dependence satisfies the condition (104), the results of Section VIII are applicable. That is, as the network size increases, the intervertex distance distribution eventually assumes an $N$-independent shape, but centered at $\bar d(N)$, where $\ln\ln N \ll \bar d(N) \ll \ln N$. When the position (degree) of the cut-off grows with $N$ faster than that on the boundary of applicability of the approach, then $\bar d$ increases with $N$ even more slowly than in Eq. (105).

Let us obtain an estimate of the asymptotic dependence $\bar d(N)$, as $N \to \infty$, in the situation when $2 < \gamma < 3$ and the
cutoff degree grows with $N$ as a power law, $q_0 \propto N^{\varepsilon}$. Here, $0 < \varepsilon < 1$. (Again we assume that the giant connected
component coincides with the entire network.) This behaviour corresponds to 1 — £(AT) ~ N~ e . From Eq. (]l|), we 
see that $z_1(N) \sim N^{\varepsilon(3 - \gamma)}$. The size of the vertex component approaches a value of a finite fraction of $N$ in $l_1$ steps: $\mathrm{const}_1\, z_1^{l_1} \sim \mathrm{const}_2\, N^{\varepsilon(3 - \gamma)\,l_1} \sim \mathrm{const}_3\, N$. So, the "linear" stage of the vertex component evolution is completed in a finite number of steps, $l_1 \sim 1/[\varepsilon(3 - \gamma)]$.

After this, a point t n in the mapping t n — > i n +i [sec Eq. (pi|)] appears in the region 1 — to' ~1-(<1. Speaking 
more precisely, the last step of the "linear" stage cannot immediately make $1 - t \sim 1$, since it leads, at the worst, to the multiplication of the value of $1 - t_{l_1}$ by $z_1$, which gives $N^{\varepsilon(3 - \gamma)}N^{-\varepsilon} = N^{-\varepsilon(\gamma - 2)} \ll 1$. So, there is a space for the second stage of the vertex component evolution, which is described by the mapping $1 - t_{n+1} = (1 - t_n)^{\gamma - 2}$. Consequently, $1 - t_n$ approaches values $1 - t \sim 1$ in $l_2 = \ln\left|\ln(1 - t_{l_1})\right|/\left|\ln(\gamma - 2)\right| \sim \ln\ln N/\left|\ln(\gamma - 2)\right|$ steps. Since $\bar d \approx l_1 + l_2$, the mean intervertex distance of networks with degree distribution exponent $2 < \gamma < 3$ and a power-law $q_0(N)$ is

$$\bar d \cong \frac{\ln\ln N}{\left|\ln(\gamma - 2)\right|}. \qquad (106)$$
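The second stage of this estimate can be mimicked by iterating the mapping $1 - t_{n+1} = (1 - t_n)^{\gamma - 2}$ directly and counting the steps. The sketch below is an illustration under stated assumptions: the starting value $N^{-\varepsilon(\gamma - 2)}$ is taken from the argument above, the stopping threshold 0.5 is an arbitrary stand-in for "$1 - t \sim 1$", and the function names are hypothetical.

```python
import math

def steps_second_stage(gamma, one_minus_t_start, threshold=0.5):
    """Count iterations of 1 - t -> (1 - t)**(gamma - 2) until 1 - t >= threshold."""
    u = one_minus_t_start
    steps = 0
    while u < threshold:
        u = u ** (gamma - 2.0)
        steps += 1
    return steps

def estimate_l2(gamma, n_vertices):
    """Leading-order estimate ln ln N / |ln(gamma - 2)|, as in Eq. (106)."""
    return math.log(math.log(n_vertices)) / abs(math.log(gamma - 2.0))

gamma, eps = 2.5, 0.5
for n_vertices in (1e12, 1e48):
    start = n_vertices ** (-eps * (gamma - 2.0))   # value of 1 - t after the linear stage
    print(n_vertices, steps_second_stage(gamma, start), estimate_l2(gamma, n_vertices))
```

The counted number of steps differs from $\ln\ln N/|\ln(\gamma - 2)|$ only by a constant of order one, and it grows by $\ln 4/\ln 2 = 2$ steps when $\ln N$ is multiplied by 4, in line with the double-logarithmic dependence.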

If the 7 exponent of the degree distribution is equal to 3 and go °c A^ £ , then Z\ oc In N, see Eq. ([!]). The size of the 
vertex component approaches a value of a finite fraction of $N$ in $l_1$ steps: $\mathrm{const}_1\, z_1^{l_1} \sim \mathrm{const}_2\,(\ln N)^{l_1} \sim \mathrm{const}_3\, N$. So, the "linear" stage of the vertex component evolution is completed in $l_1 \sim \ln N/\ln\ln N$ steps. After this, a point $t_n$ in the mapping $t_n \to t_{n+1}$ [see Eq. (24)] appears in the region $1 - t_{l_1} \sim 1 - \zeta \sim N^{-\varepsilon} \ll 1$. The Z transform of the function with the asymptote $k^{-3}$ at large $k$ is $\phi(z) \cong \frac{1}{2}(1 - z)^2\left|\ln(1 - z)\right|$ near $z = 1$, so that $\phi_1(z) \cong 1 - (1 - z)\left|\ln(1 - z)\right|$. So, in the region $1 - \zeta(N) \ll 1 - t_n \ll 1$, the mapping is

$$1 - t_{n+1} = (1 - t_n)\left|\ln(1 - t_n)\right|. \qquad (107)$$

It takes $l_2 = \left|\ln(1 - t_{l_1})\right|/\ln\left|\ln(1 - t_{l_1})\right| \sim \varepsilon\ln N/\ln\ln N$ steps to approach $1 - t_n \sim 1$. Consequently, the mean intervertex distance of networks with degree distribution exponent $\gamma = 3$ and a power-law $q_0(N) \propto N^{\varepsilon}$ behaves as



The estimates (106) and (108) support those in Ref. [13]. Note that the resulting asymptotics of the mean intervertex distance (106) and (108) are independent of the mean degree. This feature may be checked by more detailed calculations. Moreover, in the case $2 < \gamma < 3$, the expression (106) for the mean intervertex distance does not contain the exponent $\varepsilon$.

In summary, we have developed a consistent formalism for the calculation of statistical characteristics of uncorrelated 
random networks with an arbitrary degree distribution. This formalism accounts for the complex structure of such 
networks. We mainly focused on the intervertex distance distributions, but many other distributions may be studied 
in a similar way. 

S.N.D. thanks PRAXIS XXI (Portugal) for a research grant PRAXIS XXI/BCC/16418/98. S.N.D. and J.F.F.M. 
were partially supported by the project POCTI/99/FIS/33141. A.N.S. acknowledges the NATO program OUT- 
REACH for support. We also thank V.V. Bryksin, A.V. Goltsev, and A. Krzywicki for useful discussions. 
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APPENDIX A: CHARACTERISTICS OF THE DISTANCE DISTRIBUTION FUNCTION 



Taking into account the properties of the functions g (y) and p (x), one can write: 

POO 

Q{l)=m 2 00 dxp(x)g(z{x) , (Al) 



where g is defined in Eq. (|30|) and p is its inverse Laplace transform: 

f°° C +io ° dv 

g(y)= dxp(x)e-*y,p(x)= -JL~ g (y) e *y . ( A2) 

Jo J-ioo 2lTl 

We consider the asymptotics of the distance distribution function Q (I) at large positive I, i.e. at small v\ — z^ 1 . Let 
us choose some xq satisfying the conditions: vi <C xq -C 1. Within the interval (0,xo), one can replace p(x) with 
Bx a ~ x in the integral in Eq. (Al), and one can replace g(y/vi) with B (y/vi)~ a within the interval (xq,oo). Thus 
we have: 

d r^o roc 

m»-2(0 s "i»rn dxx a - l ~g{x/v l )+ml a Bv^ dxx~ a p{x) . (A3) 

Let us replace the integration variable x in the first integral with y — x/vi. Then, in the first integral we repre- 
sent the integrand as d(\ny) y a g(y) and integrate by parts. In the second integral we also write the integrand as 
d (In a;) x l ~ a p (x) and integrate by parts. As a result, we have: 

mi-Q(0 = 

m^Bvf | ^- )9 (xo/vt) (^j " In (^j - J- J^' dy y^ 1 In y [y~g> (y) + ay (y)} - 

p{x )x 1 ~ a h-ix a - [ dx x~ a lnx [xp' (x) - (a - l)p(x)}\ . (A4) 



One can replace g (xq/vi) in the first integrated term with its asymptotics B (xq/vi) a , because x^/vi 3> 1, and p (xo) 
with Bxq~ 1 /r (a) in the second integrated term, because xq •C 1- Also, the upper limit in the first integral may be 
replaced with oo, because this integral is convergent, yg' (y) + ag (y) = O (1/y) as y — > oo. Similarly, the lower limit 
in the second integral may be replaced with 0, because xp' (x) — (a — l)p(x) = O (x) as x — > 0. So, 

^2 D,,ct ( / i \ /-oo 

™ 2 oo-Q(0 -^f{ Bln (-j " y ^2/ Q ^iny[y5'(y) + «5(y)]- 

r (a) J dx X - a In a; [xp' (x)-(a-l)p {x)} j . (A5) 

Each of the two integrals in the braces may be expressed in terms of the other on e. F or example, let us take the first 
integral, and substitute into it g(y), which is the Laplace transform of p(x), Eq. (|A2|). Note that 



yg' (v) = - J dx [xp' (x) + p (x)] e xy . 

Then, changing the order of integrations, we have 

f 

dy y Q_1 lny [yg' (y) + ag (y)] = -/ dx [xp' (x) - (a - 1) p (x)] I dy y a ~ l e~ x% 'lny 



T(a) I dxx- a [\nx-ip(a)}[xp'(x)-(a-l)p(x)} , (A6) 

where $\psi(\alpha) = [\ln \Gamma(\alpha)]'$. Quite analogously,

$$ \int_0^{\infty} dy\, y^{\alpha-1} [y g'(y) + \alpha g(y)] = -\Gamma(\alpha) \int_0^{\infty} dx\, x^{-\alpha} [x p'(x) - (\alpha-1) p(x)] . \tag{A7} $$



Then from Eqs. (A6) and (A7) one can easily obtain

$$ \Gamma(\alpha) \int_0^{\infty} dx\, x^{-\alpha} \ln x\, [x p'(x) - (\alpha-1) p(x)] = \int_0^{\infty} dy\, y^{\alpha-1} [\ln y - \psi(\alpha)] [y g'(y) + \alpha g(y)] . \tag{A8} $$
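The step from the double integral to the final form of Eq. (A6) rests on the standard integral $\int_0^\infty dy\, y^{\alpha-1} e^{-xy} \ln y = \Gamma(\alpha)\, x^{-\alpha} [\psi(\alpha) - \ln x]$, which is the $\alpha$-derivative of $\Gamma(\alpha)\, x^{-\alpha}$. A minimal numerical sketch of this identity follows. It uses only the Python standard library; the digamma function is approximated by a central difference of `math.lgamma`, and the helper names and sample values of $\alpha$ and $x$ are arbitrary illustrative choices.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def digamma(a, h=1e-5):
    """psi(a) = d/da ln Gamma(a), approximated by a central difference."""
    return (math.lgamma(a + h) - math.lgamma(a - h)) / (2.0 * h)


def kernel_numeric(alpha, x):
    """Quadrature of int_0^inf dy y^(alpha-1) exp(-x y) ln y, using y = exp(t)."""
    def integrand(t):
        # y^(alpha-1) dy = exp(alpha t) dt and ln y = t
        return t * math.exp(alpha * t - x * math.exp(t))
    return simpson(integrand, -60.0, 6.0)


def kernel_closed_form(alpha, x):
    """Gamma(alpha) x^(-alpha) [psi(alpha) - ln x]."""
    return math.gamma(alpha) * x ** (-alpha) * (digamma(alpha) - math.log(x))


if __name__ == "__main__":
    for alpha, x in ((0.5, 0.5), (1.5, 2.0), (2.5, 1.0)):
        print(alpha, x, kernel_numeric(alpha, x), kernel_closed_form(alpha, x))
```

Dropping the factor $\ln y$ from the integrand gives the simpler kernel $\Gamma(\alpha)\, x^{-\alpha}$ that underlies Eq. (A7).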



Substituting either Eq. (A7) or Eq. (A8) into Eq. (A5) and taking into account that $v_l^{\alpha} = z_1^{-\alpha l} = z_c^{l}$, we arrive at the expressions represented by Eqs. (42)-(43).

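Now let us calculate the first two moments of $l$. We have:

$$ \bar{l} = -\int_0^{\infty} dx\, p(x) \int_{-\infty}^{+\infty} dl\, l\, \frac{\partial}{\partial l}\, g(z_1^{l} x) . \tag{A9} $$

Changing the integration variable $l$ to $y = z_1^{l} x$, $l = (\ln y - \ln x)/\ln z_1$, we obtain

$$ \bar{l} = -\frac{1}{\ln z_1} \Big[ \int_0^{\infty} dy\, g'(y) \ln y + \int_0^{\infty} dx\, p(x) \ln x \Big] . \tag{A10} $$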
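The change of variables leading to Eq. (A10) can be checked numerically on a toy example. The sketch below assumes the pair $p(x) = e^{-x}$, $g(y) = 1/(1+y)$ and the value $z_1 = 3$, none of which come from the paper, and it takes the distance density to be the positive weight $-\partial g(z_1^l x)/\partial l$ (an assumption about the sign convention). For this pair $\int_0^\infty dy\, g'(y) \ln y = 0$ and $\int_0^\infty dx\, p(x) \ln x = -\gamma_e$, so the logarithmic form predicts $\bar{l} = \gamma_e/\ln z_1$.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant gamma_e


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def mean_distance_direct(z1):
    """Mean of l from the double integral over x and l (toy pair, see lead-in)."""
    ln_z = math.log(z1)

    def weight(l, x):
        # -d/dl g(z1^l x) for g(y) = 1/(1+y): positive and normalised in l
        u = z1 ** l * x
        return ln_z * u / (1.0 + u) ** 2

    def inner(x):
        return simpson(lambda l: l * weight(l, x), -30.0, 50.0, 800)

    def outer(t):
        x = math.exp(t)  # x = exp(t), so dx p(x) = exp(t - x) dt
        return math.exp(t - x) * inner(x)

    return simpson(outer, -25.0, 4.0, 400)


def mean_distance_formula(z1):
    """Logarithmic form evaluated in closed form for the toy pair."""
    return EULER_GAMMA / math.log(z1)


if __name__ == "__main__":
    z1 = 3.0  # hypothetical branching ratio
    print(mean_distance_direct(z1), mean_distance_formula(z1))
```

The integration windows in `inner` and `outer` are tuned to $z_1 = 3$; other values of $z_1$ may need wider limits in $l$.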



Again, the integrals in Eq. (A10) can be expressed in terms of each other. Substituting $g$ from Eq. (A2) and changing the order of integrations, we get

$$ \int_0^{\infty} dy\, g'(y) \ln y = -\int_0^{\infty} dx\, p(x)\, x \int_0^{\infty} dy\, e^{-xy} \ln y = \int_0^{\infty} dx\, p(x) \ln x + \gamma_e , \tag{A11} $$

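where $\gamma_e$ is the Euler constant. So, we have obtained the main-text expression for $\bar{l}$. Analogously we have

$$ \overline{l^2} = -\frac{1}{\ln^2 z_1} \Big[ \int_0^{\infty} dy\, g'(y) \ln^2 y - \int_0^{\infty} dx\, p(x) \ln^2 x - 2 \int_0^{\infty} dy\, g'(y) \ln y \int_0^{\infty} dx\, p(x) \ln x \Big] . \tag{A12} $$

Quite similarly to Eq. (A11) one can obtain

$$ \int_0^{\infty} dy\, g'(y) \ln^2 y = -\int_0^{\infty} dx\, p(x) \ln^2 x - 2\gamma_e \int_0^{\infty} dx\, p(x) \ln x - \gamma_e^2 - \frac{\pi^2}{6} . \tag{A13} $$

Then, using Eqs. (A11) and (A13), one can rewrite Eq. (A12) as

$$ \overline{l^2} = \frac{1}{\ln^2 z_1} \Big\{ 2 \int_0^{\infty} dx\, p(x) \ln^2 x + 4\gamma_e \int_0^{\infty} dx\, p(x) \ln x + 2 \Big[ \int_0^{\infty} dx\, p(x) \ln x \Big]^2 + \gamma_e^2 + \frac{\pi^2}{6} \Big\} = -\frac{1}{\ln^2 z_1} \Big\{ 2 \int_0^{\infty} dy\, g'(y) \ln^2 y + 4\gamma_e \int_0^{\infty} dy\, g'(y) \ln y - 2 \Big[ \int_0^{\infty} dy\, g'(y) \ln y \Big]^2 - \gamma_e^2 + \frac{\pi^2}{6} \Big\} . \tag{A14} $$

From Eqs. (A10) and (A14), we immediately obtain the corresponding expression in the main text.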
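The two forms of the second moment can be cross-checked numerically. The sketch below again assumes the toy pair $p(x) = e^{-x}$, $g(y) = 1/(1+y)$ and a hypothetical $z_1 = 3$, neither of which comes from the paper. It evaluates the logarithmic moments of $p$ and of $g'$ by quadrature, inserts them into the $p$-form and the $g'$-form of the second moment as read from Eqs. (A11)-(A14), and also forms the variance $\overline{l^2} - \bar{l}^2 = [2\,\mathrm{Var}(\ln x) + \pi^2/6]/\ln^2 z_1$ that follows from combining the first and second moments.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler constant gamma_e


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def log_moment_p(k):
    """int_0^inf dx p(x) ln^k x for the toy density p(x) = exp(-x), via x = exp(t)."""
    return simpson(lambda t: t ** k * math.exp(t - math.exp(t)), -50.0, 6.0)


def log_moment_gprime(k):
    """int_0^inf dy g'(y) ln^k y for g(y) = 1/(1+y), via y = exp(t)."""
    def integrand(t):
        e = math.exp(t)
        return -(t ** k) * e / (1.0 + e) ** 2
    return simpson(integrand, -50.0, 50.0)


def second_moment_from_p(z1):
    p1, p2 = log_moment_p(1), log_moment_p(2)
    bracket = (2 * p2 + 4 * EULER_GAMMA * p1 + 2 * p1 ** 2
               + EULER_GAMMA ** 2 + math.pi ** 2 / 6)
    return bracket / math.log(z1) ** 2


def second_moment_from_g(z1):
    g1, g2 = log_moment_gprime(1), log_moment_gprime(2)
    bracket = (2 * g2 + 4 * EULER_GAMMA * g1 - 2 * g1 ** 2
               - EULER_GAMMA ** 2 + math.pi ** 2 / 6)
    return -bracket / math.log(z1) ** 2


def variance(z1):
    """Second moment minus squared mean: [2 Var(ln x) + pi^2/6] / ln^2 z1."""
    p1, p2 = log_moment_p(1), log_moment_p(2)
    return (2 * (p2 - p1 ** 2) + math.pi ** 2 / 6) / math.log(z1) ** 2


if __name__ == "__main__":
    z1 = 3.0  # hypothetical branching ratio
    print(second_moment_from_p(z1), second_moment_from_g(z1), variance(z1))
```

For this toy pair $\mathrm{Var}(\ln x) = \pi^2/6$, so the variance reduces to $\pi^2/(2 \ln^2 z_1)$, independent of the mean.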



APPENDIX B: SOLUTION OF THE MAIN EQUATION IN THE CASE OF A POWER-LAW DEGREE DISTRIBUTION WITH AN EXPONENTIAL CUT-OFF



Differentiating Eq. (80) twice, setting $y = 0$, and taking into account the initial conditions for $f(y)$, one can get the expression for $f''(0)$:



1 {> (zi-1) (1-0 7-2 1 U 



(Bl) 



This means one can use the linear approximation for $f(y)$, $f(y) \approx 1 - y$, as long as $|y| \ll (1-\zeta)^{\gamma-2}$. The next step is to make one iteration in Eq. (80), substituting $f(y) = 1 - y$ into its right-hand side. The resulting expression for $f(y)$,
f(y) = l-(l-C)



7-2 



C + -y 



i-(i-C) 



7-2 



7-2 



(B2) 



is valid if \y\ < (1 - C) 
reduced to 



7-2 



(1 - C) 27 " 5 - In particular, if (1 - () 7 ~ 2 < \y\ < (1 - C) 27 ~ 5 , Eq. ® may be 



/ (y) = i - 



(7-2) (1-0 



7-3 



7-2 



(B3) 



On the other hand, in this region we have |1 — / (y)\ 3> (1 — £) 7 2 an d |1 — / (x/zi)\ Cl-(. Therefore, if \y\ ^ 
(1 — C) 7 ~ , one can neglect the term (1 — £) 7 ~ / (y) on the left-hand side of Eq. (80), and the term (1 — () f (y/zi) 
on the right-hand side, i.e. simply to set ( = 1 in this equation. As a result, we obtain Eq. (|S2|), whose solution 
is given by Eq. (^) with the constant A to be determined. This can be done, if we take into account that in the 
interval (1 - C) 7 < \y\ < (1 - C) 27 " 5 , both Eq. (pi) and Eq. @ are valid. Let us choose some j/o within this 
interval and represent there the exponent in Eq. (p0[) as Ay~® ~ AyQ^ (1 — $ln(y/yo))- Then one can write 



/(») 



- 

yoj 



exp (-Ay *) 



Comparing Eqs. (B3) and (B4), we obtain: 



x ^e( 7 -2)(l-C) (7 - 3)[1+1/ln(7 " 2)1 ^ = ^. 

ev 

Here the expression (B3J) for $ was used. 

(B4)

(B5)

Thus we obtain Eq. (85) for $g(y) = \varphi[f(y)]$, which is valid if $|y| \gg (1-\zeta)^{\gamma-2}$. Now we must calculate its inverse Laplace transform. That is, we have to calculate the integral



+ioc+<5 



dy 

2m 



1 



exp 



y 



dy 
2m 



- exp xy 



if 



(B6) 



and the other one, which differs only by the additional factor $\gamma - 1$ before the second term in the exponential. The integration contour $C$ goes from $-\infty$ to $0$ along the lower shore of the $(-\infty, 0)$ cut and comes back along the upper shore. This equality holds for $x > 0$. Note that the term in the braces on the left-hand side of Eq. (B6) becomes zero as $y \to +\infty$, which means that the integral has no $\delta$-functional singularity at $x = 0$. We assume here $|x| \ll (1-\zeta)^{2-\gamma}$ and will make use of the smallness of $\vartheta$. The actual region of integration is $|y| \lesssim 1/x$. Here we effectively write:



y~* x 3 



— [1-0 In (a*)] 



Indeed, because this expression is in an exponential, the criterion is the smallness of the neglected terms compared 
to 1. We can estimate them as dx® In 2 (xy) ~ dx § < § (1 — £) 1? ( 2 ~' 7 ' ~ $ <c 1. Then the integral in Eq. (B6), after 
the replacement of the integration variable, $z = xy$, turns into



exp (— or /etf) 



dz 
2%i' 



z x ' e e x 



xT 



x 
e 



exp 



(B7) 



The function on the right-hand side of Eq. (B7) is essentially nonzero if x < 1?, so that one can replace there 
L (-x s /e) -» -e 